Psychological Monographs
General and Applied

A Technique for the Development of a
Differential Prediction Battery

By
Paul Horst
University of Washington

Price $1.00

Edited by Herbert S. Conrad
Published by The American Psychological Association, Inc.

Vol. 68, No. 9
Whole No. 380


Psychological Monographs: General and Applied

Editor
Herbert S. Conrad
Department of Health, Education, and Welfare
Office of Education
Washington 25, D.C.

Consulting Editors

Harold E. Jones
Donald W. MacKinnon
Lorrin A. Riggs
Carl R. Rogers
Saul Rosenzweig
Ross Stagner
Percival M. Symonds
Joseph Tiffin
Ledyard R Tucker
Joseph Zubin



Vol. 68, No. 9 


Whole No. 380, 1954 


Psychological Monographs: General and Applied 


A Technique for the Development of a 
Differential Prediction Battery¹
Paul Horst 
University of Washington 


I. THE DIFFERENTIAL PREDICTION
PROBLEM


METHODS for developing a test battery
to predict success in a single speci- 
fied type of activity are well known. These 
methods are concerned primarily with 
selecting tests to predict a single criterion. 
But in most practical situations we are 
not concerned with predicting success in 
just a single activity or assignment. In 
the military service it is not a question of 
predicting the success of a prospective 
recruit in just a single specialty. Rather, 
the military services are concerned with 
predicting success in all or most of their 
numerous (military occupational) special- 
ties. In an industrial organization, it is 
not enough to have procedures for pre- 
dicting success in merely a single type of 
job. It is desirable to have procedures for 
predicting success in as many of the 
different jobs existing in the company as 
possible. 
Similarly in our institutions of higher 
education it is not enough for an entering 
student to know simply whether he will or 


¹ This research was carried out under Contract
Nonr-477(08) between the University of Washing- 
ton and the Office of Naval Research. The data used 
to illustrate the technique were provided by Profes- 
sor August Dvorak. Most of the computations were 
carried out by Robert Dear and William Meredith. 
Charlotte MacEwan also assisted with the computa- 
tions and assumed major editorial responsibility for 
the preparation of the manuscript. Much credit is 
due the typist, Eleanor Green. Supervision of both 
computational and editorial activities was provided 
by William Clemans. To each of these able con- 
tributors I am deeply grateful. 


will not do well as an English major. It is 
also important for him to know as accu- 
rately as possible how well he will do in all 
other areas of instruction open to him. 

It is believed that a more efficient uti- 
lization of our manpower resources de- 
pends very largely on the improvement of 
techniques for predicting differentially the 
success of persons in each of a possible 
number of important activities. 

Clearly, however, it is not feasible to 
attempt to develop a separate battery of 
tests to predict success in each of the 
numerous military occupational special- 
ties or each of the many jobs within a 
large company or each of the dozens of 
areas of instruction within our institutions 
of learning. Any intensive or ambitious 
attempts along this line would lead natu- 
rally to a very large number of test bat- 
teries which each person would have to 
take in order to get predictions of success 
in each assignment for which he might be 
considered. 

Ideally we should have a single classi- 
fication battery of tests which, by means 
of differential weighting procedures, would 
enable us to predict success in each of a 
wide variety of activities or specialties. 

Such a classification battery should pre- 
sumably predict as accurately as possible 
differences in success between all possible 
pairs of activities if it is to be maximally 
useful for classification or differential pre- 
diction purposes. For a wide variety of 
assignments this would ordinarily presuppose
a heterogeneous battery of predictors.




In the development of a differential 
prediction battery one would presumably 
start with a somewhat larger experimental 
battery of predictors than would be used 
for administrative purposes. In this re- 
spect the approach is the same as for the 
development of a battery to predict a 
single criterion. Presumably one would
then specify a maximum number, less 
than the total number of predictors, which 
would be permitted in the final prediction 
battery. The problem then would be to 
select those particular tests, no greater in 
number than prespecified, which would 
most accurately predict differences in suc- 
cess for all possible pairs of activities. 

Various aspects of the problem of differ- 
ential prediction have been treated by 
Brogden (2), Mollenkopf (7), Thorndike 
(9), and Wesman and Bennett (11). A
problem closely related to that of differ- 
ential prediction is that of differential 
classification. In this problem it is as- 
sumed that indices of potential success are 
already available for each of a group of 
individuals in each of a number of activi- 
ties, and that a specified proportion of the 
group is to be assigned to each activity. 
Here the problem is to assign the persons 
to the activity groups in accordance with 
the quotas and so as to maximize the sum 
of all criterion indices corresponding to the 
job assignments. This problem has been 
considered by Brogden (1), Lord (6), 
Thorndike (9), and Votaw (10). It is, 
however, beyond the scope of this article 
to consider in detail the relationships be- 
tween the problems of differential predic- 
tion and differential classification. It is the 
purpose of this article to present a method 
for selecting from a relatively large num- 
ber of predictors that subset of specified 
size which will yield the most accurate 
predictions of differences between all pairs 
of criterion measures within a specified 
set. It is believed that no such technique 


has been previously proposed by any of 
the investigators referred to above. 

The method assumes that we have 
available adequate estimates of the inter- 
correlations of all the potential predictor 
variables for a given population and also 
that we have adequate estimates of the 
correlation of each of these potential pre- 
dictor variables with each of the criterion 
variables for the same population. Ac- 
tually, if the number of criterion variables 
is large, it will rarely be the case that for 
a single sample of individuals measures on 
all criterion variables will be available for 
each person, even though measures on 
each of the predictor variables may be 
available for each person in the sample. 
For example, a battery of predictor tests 
may be given to a group of military re- 
cruits or entering college freshmen. Sub- 
sequently, however, any given recruit will 
engage in only a limited number of mili- 
tary occupational specialties and, like- 
wise, a college entrant will take only a 
limited number of the total available 
courses. Consequently, in either case,
criterion measures for all of the criteria will
not be available for all persons on whom 
predictor measures are available. 

Wesman and Bennett (11) have pro- 
posed that this obstacle to an adequate 
solution of the differential prediction 
problem be met by setting up experi- 
mental situations in which a large group 
of persons will be given a predictor bat- 
tery. Subsequently each member of the 
group will participate in each of a large 
number of activities so that criterion 
measures may be obtained on each cri- 
terion for each member of the group. Such 
an ambitious approach to the problem ap- 
pears neither adequate nor necessary. 
Furthermore, it does not appear likely 
that a single individual could participate 
in a large enough number of activities for 
a sufficient length of time to yield reliable 




criterion measures for all criteria. Prob- 
lems concerning transfer and retroactive 
inhibition would most certainly enter in. 
It is probable that available techniques 
such as those outlined by Gulliksen (4) 
and others, which might be developed for 
adjusting correlations for variation in dis- 
persions, offer a more fruitful and practical 
solution to this aspect of the differential 
prediction problem. In any case, however,
the scope of this presentation does not in- 
clude further consideration of this par- 
ticular aspect of the problem. 

In order to develop a method for select- 
ing that subset of predictors of specified 
size which will yield the most accurate 
predictions of differences between all pairs 
of criterion measures we must first define 
mathematically an index of differential 
prediction efficiency of a test battery. To
do this we start with an assumption which 
in actual practice need not be satisfied. 
We assume that for each person we have 
difference scores on all possible pairs of 
criterion measures within the set; these 
scores are assumed to be standard meas- 
ures as is the case throughout the discus- 
sion. By conventional methods we may 
obtain the best estimate in the least-square 
sense for the difference scores derived 
from all possible pairs of criterion vari- 
ables. The index of the differential predic- 
tion efficiency of the battery is taken to be 
a simple function of the average of the 
variances for the predicted difference 
scores for all possible pairs of criterion 
variables. The larger this average variance 
the greater the differential prediction 
efficiency of the battery. 

As will be shown in a later section, this
index is equivalent to the difference be- 
tween the average variance of the pre- 
dicted criterion measures and the average 
of their covariances, assuming standard 
measures for both predictors and criteria 
and that the predicted criteria are the 


“least-square” estimates. The problem 
then is to select that subset of predictors 
of specified number for which the differ- 
ence between average variance and co- 
variance of the predicted criterion meas- 
ures is largest. 
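
The equivalence just stated can be checked numerically. The following is a minimal sketch, assuming a small made-up covariance matrix `P` of predicted criterion measures; the numbers and the function names are illustrative and are not taken from the monograph. Under these assumptions the average variance of the predicted difference scores comes out as exactly twice the difference between the average variance and the average covariance, which is one reading of the "simple function" mentioned above.

```python
import math
from itertools import combinations

# Hypothetical covariance matrix of predicted criterion measures
# (diagonal entries play the role of squared multiple correlations).
P = [[0.5, 0.2, 0.1],
     [0.2, 0.4, 0.3],
     [0.1, 0.3, 0.6]]

def average_difference_variance(P):
    """Average variance of predicted difference scores over all pairs."""
    pairs = list(combinations(range(len(P)), 2))
    total = sum(P[j][j] + P[k][k] - 2 * P[j][k] for j, k in pairs)
    return total / len(pairs)

def variance_minus_covariance(P):
    """Average variance minus average covariance of the predicted criteria."""
    n = len(P)
    avg_var = sum(P[j][j] for j in range(n)) / n
    avg_cov = sum(P[j][k] for j in range(n) for k in range(n) if j != k) / (n * (n - 1))
    return avg_var - avg_cov

index = variance_minus_covariance(P)
print(index, average_difference_variance(P))
```

Because the two quantities differ only by a constant factor, ranking batteries by either one gives the same ordering.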

Without attempting to answer in detail 
the concern of Wesman and Bennett (11) 
that differential prediction procedures 
may lose much value in the way of abso- 
lute prediction, it may be noted that, as- 
suming average covariances of predicted 
criterion measures equal, the battery of 
tests having the highest average squared 
multiple correlation with all the criteria 
will yield the greatest differential predic- 
tion. This must, of course, be true because 
the variance of a least-square predicted cri- 
terion is precisely the square of the mul- 
tiple correlation of the criterion with the 
predictors when standard units are em- 
ployed. 

The method to be outlined proceeds by 
selecting one predictor at a time. The first 
predictor selected is the one which by it- 
self yields the highest index of differential 
prediction efficiency. The second predictor 
selected is the one which, when combined 
with the first, yields the highest index of 
differential prediction efficiency. The 
third predictor selected is the one which 
when combined with the first two yields 
the highest index of differential prediction 
efficiency. This process is continued until 
the desired number of predictors has been 
selected. 
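
The one-at-a-time selection described above can be sketched generically. This is a minimal illustration, assuming a caller-supplied function `index` that returns the index of differential prediction efficiency for any list of predictors; `forward_select`, `toy_index`, and the toy values are hypothetical names and numbers, not part of the original method.

```python
def forward_select(candidates, index, size):
    """Pick predictors one at a time, each time adding the best remaining one."""
    chosen = []
    for _ in range(size):
        remaining = [c for c in candidates if c not in chosen]
        if not remaining:
            break
        best = max(remaining, key=lambda c: index(chosen + [c]))
        chosen.append(best)
    return chosen

# Hypothetical index: each predictor has a value of its own, and
# predictors 0 and 2 overlap, so together they are worth less.
toy_value = {0: 0.30, 1: 0.50, 2: 0.40}

def toy_index(subset):
    total = sum(toy_value[c] for c in subset)
    if 0 in subset and 2 in subset:
        total -= 0.25  # overlap penalty
    return total

picked = forward_select([0, 1, 2], toy_index, 2)
print(picked)
```

The computational method of Part II reaches the same kind of result without re-evaluating the index from scratch at each cycle.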

The question may well be raised 
whether for a subset of S predictors, the 
procedure indicated would be the same as 
if one had tried all possible subsets of S 
predictors and selected the one with the 
highest index of differential prediction. 
There appears to be no definitive answer 
to this question other than actually to 
compute, for a given set of data, indices of 
differential prediction efficiency for all 




possible subsets of predictors; and then for 
all subsets of a given size, to compare the 
results with those obtained by the method 
to be outlined. The labor involved in em- 
pirical tests such as this would doubtless 
be prohibitive even with high-speed electronic
computers. Even if the computers
were sufficiently rapid it is still probable 
that the labor in programming and in 
processing the data would be prohibitive. 
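
The size of the exhaustive comparison can be made concrete. The sketch below is only a counting illustration; the function names are hypothetical, and the choice of 8 predictors simply matches the numerical illustration of Part III. It counts the subsets an exhaustive search would have to score and, for contrast, the number of candidate evaluations made by the one-at-a-time procedure.

```python
from math import comb

def subsets_of_size(n, s):
    """Number of distinct subsets of s predictors drawn from n."""
    return comb(n, s)

def all_nonempty_subsets(n):
    """Number of non-empty subsets of n predictors, of every size."""
    return 2 ** n - 1

def progressive_evaluations(n, s):
    """Candidates examined when selecting s predictors one at a time."""
    return sum(n - j for j in range(s))

n_predictors, subset_size = 8, 4
print(subsets_of_size(n_predictors, subset_size),
      all_nonempty_subsets(n_predictors),
      progressive_evaluations(n_predictors, subset_size))
```

The exhaustive count grows combinatorially with the number of predictors, while the progressive count grows only with the product of the number of predictors and the subset size.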

In any case the empirical approach 
would answer the question only for a given 
set of data. Less complete empirical tests 
could of course be made. For example, one 
could use the progressive selection method,
as indicated, to get a subset of predictors.
One could then repeat the procedure
except that he would start with
some predictor in the selected subset other 
than the previous first selection. If such 
a variation of the original procedure 
tended to yield the same subset as the 
original procedure, irrespective of the 
initial selection, this could be regarded as 
evidence that the proposed method tended 
to select that subset of specified size with 
the highest index of differential prediction 
efficiency. Very limited results indicate 
that such variations in procedure actually 
do tend to select the same subset. It 
would seem desirable to conduct further 
empirical and theoretical investigations 
along this line. However, it is of interest 
to point out that widely used single cri- 
terion predictor selection methods such as 
those developed by Horst (5) and Wherry 
(8) are based on this same principle of 
selecting one predictor at a time. 

It will be proved in a later section that 
the maximum difference between the aver- 
age variance and covariance of predicted 
criterion measures, i.e., the maximum 
differential prediction efficiency of a bat- 
tery, cannot exceed the difference between 
unity and the average intercorrelation of the 
criterion measures. However, the inter- 


correlations of the actual criterion meas- 
ures are not required in the predictor se- 
lection procedure. This is fortunate since 
in most typical situations they will not 
be available. 

The remainder of this presentation will 
consist of three parts. First, we shall de- 
scribe the computational procedure. Sec- 
ond, we shall give a numerical example of 
the procedure. Third, we shall give a 
mathematical proof of the method. 


II. THE COMPUTATIONAL METHOD
A. Selecting the Differential Predictors 


Assume that we have the matrix of intercorrelations
of potential predictors, n in number, and the matrix
of correlations of the n predictors with the criterion
variables, N in number. We let

${}_1g$ = the (n x n) matrix of intercorrelations
of the n predictors,

${}_1T$ = the (n x N) matrix of correlations
of the n predictors with the
N criteria.


It is clear then that each column of ${}_1T$
represents a criterion variable and each
row a predictor variable. The steps in the
procedure correspond to those illustrated
in Part III, Section B. Each step may be
clarified by referring to the corresponding
numerical illustration. It will be noted
that in the numerical example, the ${}_1T'$
matrix is used rather than ${}_1T$. In fact, the
transpose of ${}_iT$ is used throughout in the
numerical computations as a more convenient
form for accommodating all the
matric elements and the relevant computations
on one worksheet. This form is
especially convenient if the number of
criterion variables is large. Consequently
operations defined in the current sections
for rows of the T matrices have been performed,
in the example, upon columns.

The steps are as follows: 

1. We start by computing N times the
variance of each row of the ${}_1T$ matrix by
the usual formula, and call these values
${}_1\Lambda_k$. For the kth row this would be

$$ {}_1\Lambda_k = \sum_{p=1}^{N} {}_1T_{kp}^2 - \frac{\left(\sum_{p=1}^{N} {}_1T_{kp}\right)^2}{N} \tag{1} $$

In matrix notation Equation 1 is

$$ {}_1\Lambda_k = {}_1T_{k.}'\left(I - \frac{1\,1'}{N}\right){}_1T_{k.} \tag{2} $$

where

${}_1T_{k.}'$ is the kth row vector of ${}_1T$,
$1'$ is an Nth order row vector with all
unit elements.

We now take the predictor variable corresponding
to the largest ${}_1\Lambda$ as the first
predictor in the subset to be selected and
designate this as predictor a.
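
Step 1 can be sketched in a few lines. This is a minimal illustration, assuming a small made-up validity matrix `T1` with three predictors (rows) and three criteria (columns); the data and the names `n_times_variance`, `lam1`, and `a` are hypothetical.

```python
import math

def n_times_variance(row):
    """Sum of squares minus (sum squared) / N, i.e. N times the row variance."""
    return sum(x * x for x in row) - sum(row) ** 2 / len(row)

# Hypothetical validities: rows are predictors, columns are criteria.
T1 = [[0.5, 0.3, 0.1],
      [0.2, 0.2, 0.2],
      [0.6, 0.1, 0.4]]

lam1 = [n_times_variance(row) for row in T1]

# Index of the first selected predictor (called a in the text).
a = max(range(len(lam1)), key=lambda k: lam1[k])
print(lam1, a)
```

Note that the second predictor, whose validities are identical for every criterion, receives a value of zero: it cannot discriminate between criteria no matter how valid it is for each one.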

2. Compute a new matrix ${}_2g$ from the ${}_1g$
matrix and its ath row or column by the
scalar formula

$$ {}_2g_{kp} = {}_1g_{kp} - {}_1g_{ak}\,{}_1g_{ap} \tag{3} $$

Equation 3 means that the element in the
kth row and pth column of ${}_2g$ is obtained
by subtracting from the corresponding
element of ${}_1g$ the product of the element
in the ath row and kth column of ${}_1g$ by the
element in the ath row and pth column
of ${}_1g$. The ath column of ${}_2g$ will have all
zero elements since all such elements will
be of the form

$$ {}_2g_{ka} = {}_1g_{ka} - {}_1g_{ak}\,{}_1g_{aa} $$

But since ${}_1g$ is a correlation matrix,
${}_1g_{aa} = 1$; it is easily demonstrated that ${}_2g$ is
also symmetrical and therefore its ath
row must also have all zero elements.
In matrix notation Equation 3 is

$$ {}_2g = {}_1g - {}_1g_{.a}\,{}_1g_{.a}' \tag{4} $$

where ${}_1g_{.a}$ is the ath column vector of ${}_1g$
and ${}_1g_{.a}'$ is its transpose.
3. Compute a new matrix ${}_2T$ from the
${}_1T$ matrix, its ath row, and the ath row
of the ${}_1g$ matrix by the scalar formula

$$ {}_2T_{kp} = {}_1T_{kp} - {}_1g_{ak}\,{}_1T_{ap} \tag{5} $$

Equation 5 means that the element in the
kth row and pth column of ${}_2T$ is obtained
by subtracting from the element in the
corresponding position of ${}_1T$ the product
of the element in the ath row and kth
column of ${}_1g$ by the element in the ath
row and pth column of ${}_1T$. The ath row of
${}_2T$ will have all zero elements since all
such elements will be of the form

$$ {}_2T_{ap} = {}_1T_{ap} - {}_1g_{aa}\,{}_1T_{ap} $$

But ${}_1g_{aa}$ is the ath diagonal element of the
${}_1g$ matrix, and all of these diagonal elements
are unity. Actually then, it is not
necessary to calculate the ath row of ${}_2T$.
In matrix notation Equation 5 is

$$ {}_2T = {}_1T - {}_1g_{.a}\,{}_1T_{a.}' \tag{6} $$

where

${}_1T_{a.}'$ is the ath row vector of ${}_1T$,
${}_1g_{.a}$ is the ath column vector of ${}_1g$.

We could, of course, take either the ath
column vector of ${}_1g$ or its transpose, the
ath row vector, since ${}_1g$ is symmetrical.
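
Steps 2 and 3 amount to removing the first selected predictor from both matrices, as in the matrix forms of Equations 4 and 6. The following is a minimal sketch, assuming made-up data for two predictors and two criteria; the function name `remove_first_predictor` and the numbers are hypothetical.

```python
import math

def remove_first_predictor(g1, T1, a):
    """Return (g2, T2): g1 and T1 with the part due to predictor a removed."""
    n, N = len(g1), len(T1[0])
    # Equation 4: subtract the outer product of column a of g1 with itself.
    g2 = [[g1[k][p] - g1[k][a] * g1[a][p] for p in range(n)] for k in range(n)]
    # Equation 6: subtract column a of g1 times row a of T1.
    T2 = [[T1[k][p] - g1[k][a] * T1[a][p] for p in range(N)] for k in range(n)]
    return g2, T2

# Hypothetical data: predictor intercorrelations and validities.
g1 = [[1.0, 0.5],
      [0.5, 1.0]]
T1 = [[0.6, 0.2],
      [0.4, 0.3]]

g2, T2 = remove_first_predictor(g1, T1, 0)
print(g2)
print(T2)
```

In this example the row and column of the selected predictor vanish in `g2`, its row vanishes in `T2`, and the remaining diagonal element of `g2` is the residual variance of the second predictor.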

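4. Compute N times the variance of
each row of ${}_2T$ obtained in step 3 by means
of the scalar formula

$$ {}_2S_k^2 = \sum_{p=1}^{N} {}_2T_{kp}^2 - \frac{\left(\sum_{p=1}^{N} {}_2T_{kp}\right)^2}{N} \tag{7} $$

In matrix notation Equation 7 is

$$ {}_2S_k^2 = {}_2T_{k.}'\left(I - \frac{1\,1'}{N}\right){}_2T_{k.} \tag{8} $$

where the notation is analogous to that
defined for Equation 2.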

5. Divide each of the ${}_2S_k^2$ values computed
in step 4 by the corresponding diagonal
element of ${}_2g$ obtained in step 2
and designate the kth ratio as ${}_2\Lambda_k$. In
scalar notation this is

$$ {}_2\Lambda_k = \frac{{}_2S_k^2}{{}_2g_{kk}} \tag{9} $$

In matrix notation Equation 9 may be
written

$$ D_{{}_2\Lambda} = D_{{}_2g}^{-1}\,D_{{}_2S^2} \tag{10} $$

where

$D_{{}_2\Lambda}$ is a diagonal matrix of the ${}_2\Lambda_k$,

$D_{{}_2g}$ is a diagonal matrix of the diagonal
elements of ${}_2g$,

$D_{{}_2S^2}$ is a diagonal matrix of the ${}_2S_k^2$'s.

Take the predictor variable corresponding
to the largest ${}_2\Lambda_k$ as the second predictor
in the subset to be selected and designate
this as predictor b.

6. Divide each element of the bth row
of ${}_2g$ by its bth diagonal element and designate
the ratio by ${}_2G_{bk}$. In scalar notation
this is

$$ {}_2G_{bk} = \frac{{}_2g_{bk}}{{}_2g_{bb}} \tag{11} $$

In matrix notation this may be written

$$ {}_2G_{b.}' = \frac{{}_2g_{b.}'}{{}_2g_{bb}} \tag{12} $$


7. Compute a new matrix ${}_3g$ from the
${}_2g$ matrix of step 2, its bth column, and
${}_2G_b$ of step 6 by the scalar formula

$$ {}_3g_{kp} = {}_2g_{kp} - {}_2g_{bk}\,{}_2G_{bp} \tag{13} $$

Equation 13 means that the element in
the kth row and pth column of ${}_3g$ is obtained
by subtracting from the corresponding
element of ${}_2g$ the product of the
element in the bth row and kth column of
${}_2g$ by the pth element of ${}_2G_b$. Both the ath
and the bth rows and columns of ${}_3g$ will
have all zero elements. In matrix notation
Equation 13 is

$$ {}_3g = {}_2g - {}_2g_{.b}\,{}_2G_{b.}' \tag{14} $$


8. Compute a new matrix ${}_3T$ from the
${}_2T$ matrix, its bth row, and the vector from
step 6 by means of the scalar equation

$$ {}_3T_{kp} = {}_2T_{kp} - {}_2G_{bk}\,{}_2T_{bp} \tag{15} $$

Equation 15 means that the element in
the kth row and pth column of the ${}_3T$
matrix is obtained by subtracting from
the corresponding element in the ${}_2T$ matrix
the product of the kth element in the
${}_2G_b$ vector by the element in the bth
row and pth column of ${}_2T$.

The bth row as well as the ath row of ${}_3T$
will now have all zero elements just as
the ath row in the ${}_2T$ matrix had all zero
elements.

In matrix notation Equation 15 may be
written

$$ {}_3T = {}_2T - {}_2G_{b.}\,{}_2T_{b.}' \tag{16} $$


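9. Compute N times the variance of
each row of ${}_3T$ obtained in step 8 by means
of the scalar equation

$$ {}_3S_k^2 = \sum_{p=1}^{N} {}_3T_{kp}^2 - \frac{\left(\sum_{p=1}^{N} {}_3T_{kp}\right)^2}{N} \tag{17} $$

In matrix notation Equation 17 is

$$ {}_3S_k^2 = {}_3T_{k.}'\left(I - \frac{1\,1'}{N}\right){}_3T_{k.} \tag{18} $$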

10. Divide each ${}_3S_k^2$ computed in step 9
by the corresponding diagonal element of
${}_3g$ obtained in step 7 and designate the
kth ratio as ${}_3\Lambda_k$. In scalar notation this is

$$ {}_3\Lambda_k = \frac{{}_3S_k^2}{{}_3g_{kk}} \tag{19} $$

or in matrix notation,

$$ D_{{}_3\Lambda} = D_{{}_3g}^{-1}\,D_{{}_3S^2} \tag{20} $$

where the notation is defined as in Equation 10.

Take the predictor variable corresponding
to the largest ${}_3\Lambda_k$ as the third predictor
in the subset to be selected and designate
this as predictor c.

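11. Repeat steps 6 through 10 to get
the fourth and subsequent predictors of
the subset. We let w be the ith selected
predictor. Then the general equations
representing the 5 steps required for selecting
all predictors after the first two are
given in scalar notation by the following 5
equations respectively.

$$ (1)\quad {}_iG_{wk} = \frac{{}_ig_{wk}}{{}_ig_{ww}} \tag{21} $$

$$ (2)\quad {}_{(i+1)}g_{kp} = {}_ig_{kp} - {}_ig_{wk}\,{}_iG_{wp} \tag{22} $$

$$ (3)\quad {}_{(i+1)}T_{kp} = {}_iT_{kp} - {}_iG_{wk}\,{}_iT_{wp} \tag{23} $$

$$ (4)\quad {}_{(i+1)}S_k^2 = \sum_{p=1}^{N} {}_{(i+1)}T_{kp}^2 - \frac{\left(\sum_{p=1}^{N} {}_{(i+1)}T_{kp}\right)^2}{N} \tag{24} $$

$$ (5)\quad {}_{(i+1)}\Lambda_k = \frac{{}_{(i+1)}S_k^2}{{}_{(i+1)}g_{kk}} \tag{25} $$

In matrix notation the corresponding
equations are

$$ (1)\quad {}_iG_{w.}' = \frac{{}_ig_{w.}'}{{}_ig_{ww}} \tag{26} $$

$$ (2)\quad {}_{(i+1)}g = {}_ig - {}_ig_{.w}\,{}_iG_{w.}' \tag{27} $$

$$ (3)\quad {}_{(i+1)}T = {}_iT - {}_iG_{w.}\,{}_iT_{w.}' \tag{28} $$

$$ (4)\quad {}_{(i+1)}S_k^2 = {}_{(i+1)}T_{k.}'\left(I - \frac{1\,1'}{N}\right){}_{(i+1)}T_{k.} \tag{29} $$

$$ (5)\quad D_{{}_{(i+1)}\Lambda} = D_{{}_{(i+1)}g}^{-1}\,D_{{}_{(i+1)}S^2} \tag{30} $$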


It should be noted that each new ${}_iT$
matrix has one more vanishing row than
the preceding one. Similarly each new ${}_ig$
matrix has one more vanishing row and
column than the preceding one. In particular
the number of vanishing rows in
${}_iT$ is i - 1 and the number of vanishing
rows and columns in ${}_ig$ is also i - 1.

Steps 6 through 10 are repeated until
the desired number of predictors has been
selected. This number may be determined
on the basis of administrative considerations
such as the total amount of testing
time available or limitation on the facilities
for processing the data.
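
The whole selection cycle of Equations 21 through 25 can be sketched as a single loop. This is a minimal illustration under stated assumptions: the data are made up (three predictors, two criteria), the names `select_predictors` and `n_times_variance` are hypothetical, and ties are broken in favor of the predictor listed first, a detail the text does not specify.

```python
import math

def n_times_variance(row):
    """Sum of squares minus (sum squared) / N, i.e. N times the row variance."""
    return sum(x * x for x in row) - sum(row) ** 2 / len(row)

def select_predictors(g, T, size):
    """Return the predictors in order of selection and their Lambda values."""
    g = [row[:] for row in g]
    T = [row[:] for row in T]
    n, N = len(g), len(T[0])
    selected, lambdas = [], []
    for _ in range(size):
        # Lambda_k = (N times the variance of row k of T) / g_kk,
        # computed only for unselected predictors with a usable pivot.
        candidates = [(n_times_variance(T[k]) / g[k][k], k)
                      for k in range(n)
                      if k not in selected
                      and not math.isclose(g[k][k], 0.0, abs_tol=1e-12)]
        if not candidates:
            break
        lam, w = max(candidates, key=lambda pair: pair[0])
        selected.append(w)
        lambdas.append(lam)
        G = [g[w][k] / g[w][w] for k in range(n)]          # Equation 21
        col_w = [g[k][w] for k in range(n)]
        row_w = T[w][:]
        g = [[g[k][p] - col_w[k] * G[p] for p in range(n)]  # Equation 22
             for k in range(n)]
        T = [[T[k][p] - G[k] * row_w[p] for p in range(N)]  # Equation 23
             for k in range(n)]
    return selected, lambdas

# Hypothetical data: predictors 0 and 1 correlate .5, predictor 2 is independent.
g = [[1.0, 0.5, 0.0],
     [0.5, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
T = [[0.6, 0.2],
     [0.5, 0.2],
     [0.1, 0.4]]

selected, lambdas = select_predictors(g, T, 3)
print(selected, lambdas)
```

In this toy case the second predictor, which overlaps with the first, is selected last even though its validities are higher than those of the third predictor, because most of its differential information has already been removed.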

The selection procedure may also be
terminated if it is believed that the $\Lambda$
values are too small to justify adding an
additional variable. As will be proved in
Part IV the sum of all the $\Lambda$'s for the selected
variables is given by

$$ {}_1\Lambda_a + {}_2\Lambda_b + \cdots + {}_i\Lambda_w = (N-1)\left({}_iC - {}_iC_p\right) \tag{31} $$

where

${}_iC$ is the average of the squared multiple
correlations for the criteria
with the i selected predictors, i.e.,
the average variance of the predicted
criterion measures,

${}_iC_p$ is the average covariance of the
predicted criterion measures estimated
from the i selected predictors.

It will be recalled that the right-hand
parenthesis on the right of Equation 31 is
precisely what we have defined as the index
of differential prediction efficiency.
Therefore, at each cycle the corresponding
highest $\Lambda$ indicates how much it is possible
to increase the differential prediction
efficiency of the subset already selected by
adding the best of the remaining unselected
tests. However, no attempt has been
made to develop tests of significance for
the $\Lambda$'s. In lieu of such tests administrative
and subjective considerations must
largely determine when enough predictor
variables have been selected.
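
The relation between the sum of the selected Λ values and the index of differential prediction efficiency can be checked on a small case. The sketch below assumes two selected predictors with a made-up intercorrelation `r` and made-up validities for three criteria; the factor N - 1 used on the right-hand side follows from the definitions of the average variance and average covariance given above and is stated here as a working assumption, not as a quotation of the original equation.

```python
import math

def n_times_variance(row):
    """Sum of squares minus (sum squared) / N, i.e. N times the row variance."""
    return sum(x * x for x in row) - sum(row) ** 2 / len(row)

r = 0.5                      # hypothetical correlation between predictors a and b
T = [[0.6, 0.2, 0.3],        # hypothetical validities of predictor a
     [0.5, 0.4, 0.1]]        # hypothetical validities of predictor b
N = len(T[0])
det = 1 - r * r              # residual variance of b after removing a

# Lambda for a, then for b after a has been removed.
lam_a = n_times_variance(T[0])
t_b = [T[1][p] - r * T[0][p] for p in range(N)]
lam_b = n_times_variance(t_b) / det

# Covariance matrix of the least-square predicted criteria, using the
# closed-form inverse of a 2 x 2 correlation matrix.
P = [[(T[0][j] * T[0][k]
       - r * (T[0][j] * T[1][k] + T[1][j] * T[0][k])
       + T[1][j] * T[1][k]) / det
      for k in range(N)] for j in range(N)]

C = sum(P[j][j] for j in range(N)) / N
C_p = sum(P[j][k] for j in range(N) for k in range(N) if j != k) / (N * (N - 1))

lhs = lam_a + lam_b
rhs = (N - 1) * (C - C_p)
print(lhs, rhs)
```

The agreement does not depend on which of the two predictors is removed first; only the individual Λ values change with the order.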


B. Solving for the Regression Vectors 


We have seen how the predictors for a 
subset of specified size may be selected to 
yield a differential prediction battery. 
Once this subset is selected, however, we 
still must solve for the weighting vectors 




corresponding to the criteria to be pre- 
dicted. We have said that these vectors 
will be the least-square regression vec- 
tors. Presumably then we could by con- 
ventional methods such as those outlined 
by Dwyer (3) solve for each of the regres- 
sion vectors as a function of the intercor- 
relations of the selected predictors and 
their correlations with the corresponding 
criterion. However, the labor may be 
greatly reduced by utilizing certain of the 
computations already performed in the 
predictor selection process. 

1. We begin by making up a matrix, L,
from certain of the elements in the G vectors
indicated in Equations 21 and 26.
The first element in the first row is taken
as zero. The remaining elements in the first
row of this matrix are taken from elements
in the ath row of the ${}_1g$ matrix. These
elements are those corresponding to all
but the first of the selected predictors.
They are arranged in the order in which
the predictors were selected and their
signs are reversed.

The first row of the L matrix we shall
call $L_{a.}'$ and its scalar elements may be
indicated by

$$ L_{a.}' = (0,\; -{}_1g_{ab},\; -{}_1g_{ac},\; \ldots) \tag{32} $$

The first two elements of the second row
of the L matrix are taken as zero. The remaining
elements of the second row are
obtained from those elements in the ${}_2G$
vector (see Equations 11 and 12) corresponding
to all but the first two selected
elements. These elements are arranged
in order of selection and their
signs are reversed. This second row may
be indicated by

$$ L_{b.}' = (0,\; 0,\; -{}_2G_{bc},\; -{}_2G_{bd},\; \ldots) \tag{33} $$

The third row of the L matrix is similarly
obtained from the ${}_3G$ vector so that

$$ L_{c.}' = (0,\; 0,\; 0,\; -{}_3G_{cd},\; \ldots) \tag{34} $$

Assuming four selected predictors, the
L matrix may be indicated by

$$ L = \begin{pmatrix}
0 & L_{ab} & L_{ac} & L_{ad}\\
0 & 0 & L_{bc} & L_{bd}\\
0 & 0 & 0 & L_{cd}\\
0 & 0 & 0 & 0
\end{pmatrix} \tag{35} $$

where the elements in the rows are indicated
by Equations 32, 33, and 34 respectively.
We note that all elements of
L in the diagonal and below are zero.

2. Next we make up a matrix the same
order as L in Equation 35 which we shall
call F'. This matrix has all zero elements
above the diagonal. The diagonal
elements are the reciprocals of diagonal
elements taken from the ${}_ig$ matrices.
These are

$$ F_{aa} = \frac{1}{{}_1g_{aa}} = 1.00, \qquad F_{bb} = \frac{1}{{}_2g_{bb}}, \qquad F_{cc} = \frac{1}{{}_3g_{cc}}, \qquad \text{etc.} \tag{36} $$

To indicate how the remaining elements
are obtained we shall assume four selected
predictors and write the F' matrix as

$$ F' = \begin{pmatrix}
F_{aa} & 0 & 0 & 0\\
F_{ba} & F_{bb} & 0 & 0\\
F_{ca} & F_{cb} & F_{cc} & 0\\
F_{da} & F_{db} & F_{dc} & F_{dd}
\end{pmatrix} \tag{37} $$

The first row of F' is already given since
$F_{aa}$ is given by Equation 36. For the second
row of F' we need calculate only $F_{ba}$
since $F_{bb}$ is given by Equation 36. We calculate

$$ F_{ba} = L_{ab}F_{bb} \tag{38} $$

The elements in the third row are calculated
from right to left beginning with $F_{cb}$
since $F_{cc}$ is given by Equation 36. We have

$$ \begin{aligned}
F_{cb} &= L_{bc}F_{cc}\\
F_{ca} &= L_{ab}F_{cb} + L_{ac}F_{cc}
\end{aligned} \tag{39} $$

Similarly the elements in the fourth row
are calculated from right to left. These are

$$ \begin{aligned}
F_{dc} &= L_{cd}F_{dd}\\
F_{db} &= L_{bc}F_{dc} + L_{bd}F_{dd}\\
F_{da} &= L_{ab}F_{db} + L_{ac}F_{dc} + L_{ad}F_{dd}
\end{aligned} \tag{40} $$

It will be noted that the F values required
for solving for the left side of a given equation
in 40 have been solved for in the preceding
equations. For those who are familiar
with methods for solving linear equations
the operations outlined will be recognized
as the "back solution."

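In matrix notation, the "back solution"
for F' may be expressed as the product of
two supermatrices,

$$ \left(F' \mid D_g^{-1}\right)\begin{pmatrix} L' \\ I \end{pmatrix} = F' $$

where the unknown elements of F' appear
on the left as well as the right side of
the equation. The known elements are
those of: L', defined by the transpose of
Equations 32 through 34; the identity
matrix; the diagonal matrix $D_g^{-1}$ since the
diagonal elements are known to be those
on the right side of Equation 36. The elements
above the diagonal of F' are known
to be zero, since F' is a "lower triangular"
matrix.

Written in element form for four selected
predictors, the above equation becomes

$$ \left(\begin{array}{cccc|cccc}
F_{aa} & 0 & 0 & 0 & \frac{1}{{}_1g_{aa}} & 0 & 0 & 0\\
F_{ba} & F_{bb} & 0 & 0 & 0 & \frac{1}{{}_2g_{bb}} & 0 & 0\\
F_{ca} & F_{cb} & F_{cc} & 0 & 0 & 0 & \frac{1}{{}_3g_{cc}} & 0\\
F_{da} & F_{db} & F_{dc} & F_{dd} & 0 & 0 & 0 & \frac{1}{{}_4g_{dd}}
\end{array}\right)
\left(\begin{array}{cccc}
0 & 0 & 0 & 0\\
L_{ab} & 0 & 0 & 0\\
L_{ac} & L_{bc} & 0 & 0\\
L_{ad} & L_{bd} & L_{cd} & 0\\ \hline
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{array}\right) = F' $$

where all the numerical values are known
except those of the nonzero elements of
F'. In this form the equation demonstrates
that the diagonal elements of F'
are those of $D_g^{-1}$, and that no other row-by-column
multiplication of $D_g^{-1}I$ contributes
to the elements of F'. For computational
purposes, we may write

$$ \begin{pmatrix}
\frac{1}{{}_1g_{aa}} & 0 & 0 & 0\\
F_{ba} & \frac{1}{{}_2g_{bb}} & 0 & 0\\
F_{ca} & F_{cb} & \frac{1}{{}_3g_{cc}} & 0\\
F_{da} & F_{db} & F_{dc} & \frac{1}{{}_4g_{dd}}
\end{pmatrix}
\begin{pmatrix}
0 & 0 & 0 & 0\\
L_{ab} & 0 & 0 & 0\\
L_{ac} & L_{bc} & 0 & 0\\
L_{ad} & L_{bd} & L_{cd} & 0
\end{pmatrix}
=
\begin{pmatrix}
0 & 0 & 0 & 0\\
F_{ba} & 0 & 0 & 0\\
F_{ca} & F_{cb} & 0 & 0\\
F_{da} & F_{db} & F_{dc} & 0
\end{pmatrix} $$

But, having obtained the diagonal elements
of F', we need no longer be interested
in the right side of the equation, and
may now use the left side, only, to solve
for and enter the remaining unknown elements
of F'. This we may readily accomplish
provided that, for any row of F', we
solve successively for the elements increasingly
to the left of the diagonal.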
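
The back solution for F' can be sketched as follows. This is a minimal illustration, assuming three predictors that are already listed in their order of selection, with made-up intercorrelations; the names `eliminate` and `back_solution` are hypothetical. The final product is included only as a numerical check: multiplying F' by the original intercorrelations of the selected predictors should leave zeros below the diagonal and ones on it.

```python
import math

def eliminate(g):
    """Return (L, pivots) for predictors listed in order of selection."""
    m = len(g)
    g = [row[:] for row in g]
    L = [[0.0] * m for _ in range(m)]
    pivots = []
    for w in range(m):
        pivots.append(g[w][w])
        G = [g[w][k] / g[w][w] for k in range(m)]
        for k in range(w + 1, m):
            L[w][k] = -G[k]                       # sign reversed, as in the text
        col = [g[k][w] for k in range(m)]
        g = [[g[k][p] - col[k] * G[p] for p in range(m)] for k in range(m)]
    return L, pivots

def back_solution(L, pivots):
    """Build the lower triangular F' row by row, working right to left."""
    m = len(pivots)
    Ft = [[0.0] * m for _ in range(m)]
    for i in range(m):
        Ft[i][i] = 1.0 / pivots[i]                # diagonal: reciprocal pivots
        for j in range(i - 1, -1, -1):
            Ft[i][j] = sum(L[j][q] * Ft[i][q] for q in range(j + 1, i + 1))
    return Ft

# Hypothetical intercorrelations of three selected predictors (a, b, c).
g = [[1.0, 0.5, 0.4],
     [0.5, 1.0, 0.3],
     [0.4, 0.3, 1.0]]

L, pivots = eliminate(g)
Ft = back_solution(L, pivots)
product = [[sum(Ft[i][q] * g[q][j] for q in range(3)) for j in range(3)]
           for i in range(3)]
print(Ft)
```

Working from right to left within each row guarantees that every F value needed on the right of the recurrence has already been entered.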

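3. The rows of F' in Equation 37 are
checked by means of the equations

$$ \begin{aligned}
{}_1g_{aa}F_{ba} + {}_1g_{ab}F_{bb} &= 0\\
{}_1g_{aa}F_{ca} + {}_1g_{ab}F_{cb} + {}_1g_{ac}F_{cc} &= 0\\
{}_1g_{aa}F_{da} + {}_1g_{ab}F_{db} + {}_1g_{ac}F_{dc} + {}_1g_{ad}F_{dd} &= 0
\end{aligned} \tag{41} $$

Or if we let

$$ e' = (1,\; {}_1g_{ab},\; {}_1g_{ac},\; {}_1g_{ad}) $$

we can write Equation 41, together with
the first row of F', in matrix notation
as

$$ F'e = (1,\; 0,\; 0,\; 0)' \tag{42} $$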
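4. Once the F' matrix has been solved
for, we are ready to compute the regression
vectors for the various criteria. First,
we make up a matrix t such that the first
row is the ath row of ${}_1T$, i.e., the column
of ${}_1T'$ corresponding to the first selected
predictor. The second row is the bth row
of ${}_2T$. In general the ith row of t is the wth
row of ${}_iT$. We have therefore,

$$ t = \begin{pmatrix}
{}_1T_{a1} & {}_1T_{a2} & {}_1T_{a3} & \cdots\\
{}_2T_{b1} & {}_2T_{b2} & {}_2T_{b3} & \cdots\\
{}_3T_{c1} & {}_3T_{c2} & {}_3T_{c3} & \cdots\\
\vdots & \vdots & \vdots &
\end{pmatrix}
= \begin{pmatrix}
t_{a1} & t_{a2} & t_{a3} & \cdots\\
t_{b1} & t_{b2} & t_{b3} & \cdots\\
t_{c1} & t_{c2} & t_{c3} & \cdots\\
\vdots & \vdots & \vdots &
\end{pmatrix} \tag{43} $$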

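5. To get the regression vector $\beta_k$ for
the kth criterion we use the F matrix and
the kth column of t in Equation 43. In
scalar notation we have

$$ \begin{aligned}
\beta_{ak} &= F_{aa}t_{ak} + F_{ba}t_{bk} + F_{ca}t_{ck} + \cdots\\
\beta_{bk} &= F_{bb}t_{bk} + F_{cb}t_{ck} + \cdots\\
&\text{etc.}
\end{aligned} \tag{44} $$

If we let $\beta$ be the matrix of regression
vectors in which the rows represent selected
predictors and the columns represent
criteria, we can write equations in 44
in matrix notation as

$$ \beta = Ft \tag{45} $$
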
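
For two selected predictors the regression weights obtained from F and t can be compared with the familiar closed-form beta weights. The sketch below assumes a made-up intercorrelation `r` and made-up validities for three criteria; all variable names are hypothetical.

```python
import math

r = 0.5                      # hypothetical correlation between predictors a and b
T = [[0.6, 0.2, 0.3],        # hypothetical validities of predictor a
     [0.5, 0.4, 0.1]]        # hypothetical validities of predictor b
N = len(T[0])

# Matrix t: row a of the original validities, row b after removing a.
t = [T[0][:], [T[1][p] - r * T[0][p] for p in range(N)]]

# Elements of F': reciprocal pivots on the diagonal, then the back solution
# with L_ab equal to minus the intercorrelation.
F_aa = 1.0
F_bb = 1.0 / (1 - r * r)
F_ba = -r * F_bb

# F is the transpose of F', so it is upper triangular.
F = [[F_aa, F_ba],
     [0.0, F_bb]]

beta = [[sum(F[i][j] * t[j][k] for j in range(2)) for k in range(N)]
        for i in range(2)]

# Closed-form two-predictor beta weights, for comparison.
direct = [[(T[0][k] - r * T[1][k]) / (1 - r * r) for k in range(N)],
          [(T[1][k] - r * T[0][k]) / (1 - r * r) for k in range(N)]]
print(beta)
```

The saving in labor comes from the fact that every quantity entering F and t has already been produced during the selection of the predictors.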
C. Solving for the Multiple Correlations 


Ordinarily it will be of considerable in- 
terest to know the multiple correlation of 
each criterion with the selected predictors. 

1. First we make up a matrix from the
rows of the ${}_1T$ matrix corresponding to the
selected predictors. The rows are written
in the order in which the variables were
selected and the matrix is designated ${}_1T_{(i)}$.
We then compute the squares of the multiple
correlations by means of the conventional
scalar formula,

$$ R_k^2 = \sum_{i} r_{ik}\beta_{ik} \tag{46} $$

where the $r_{ik}$ are the correlations of the
selected predictors with a given criterion
and the $\beta_{ik}$ are the corresponding beta
weights.

If, as before, we consider only the i selected
predictors, and let

${}_1T_{(i)k}$ be the kth column of ${}_1T_{(i)}$,

$\beta_k$ = the column vector of beta weights
for the criterion k,

$R_k$ = the multiple correlation of criterion
k with the predictors,

we have in matrix notation

$$ R_k^2 = {}_1T_{(i)k}'\,\beta_k \tag{47} $$


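2. As a check, the $R_k^2$'s are also computed
independently by means of the
scalar formula

$$ R_k^2 = \frac{{}_1T_{ak}^2}{{}_1g_{aa}} + \frac{{}_2T_{bk}^2}{{}_2g_{bb}} + \frac{{}_3T_{ck}^2}{{}_3g_{cc}} + \cdots \tag{48} $$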
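
The two routes to the squared multiple correlation can be compared on a small case. This is a minimal sketch, assuming two selected predictors with a made-up intercorrelation `r` and made-up validities for three criteria; the variable names are hypothetical.

```python
import math

r = 0.5                      # hypothetical correlation between predictors a and b
T = [[0.6, 0.2, 0.3],        # hypothetical validities of predictor a
     [0.5, 0.4, 0.1]]        # hypothetical validities of predictor b
N = len(T[0])
pivot_b = 1 - r * r          # residual variance of b after removing a

# Beta weights for each criterion (closed form for two predictors).
beta_a = [(T[0][k] - r * T[1][k]) / pivot_b for k in range(N)]
beta_b = [(T[1][k] - r * T[0][k]) / pivot_b for k in range(N)]

# Conventional formula: sum of validity times beta weight.
R2_from_betas = [T[0][k] * beta_a[k] + T[1][k] * beta_b[k] for k in range(N)]

# Check formula: squared residual validities divided by their pivots.
t_b = [T[1][k] - r * T[0][k] for k in range(N)]
R2_from_pivots = [T[0][k] ** 2 / 1.0 + t_b[k] ** 2 / pivot_b for k in range(N)]

print(R2_from_betas)
print(R2_from_pivots)
```

Agreement between the two lists gives some assurance that both the beta weights and the intermediate matrices of the selection process were computed correctly.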


III. A NUMERICAL ILLUSTRATION 
A. The Data 


We shall now illustrate the methods by 
a numerical example. The sample con- 
sists of 2,243 entering freshmen at the 
University of Washington for the fall of 
1947. There are eight predictor variables 
as follows: average high school grades in 
(1) English, (2) mathematics, (3) foreign 
language, (4) social science, (5) natural 
science, (6) electives, and the Q and L 
scores on the American Council Examina- 
tion. 

The criterion measures are the grade 
point averages for ten different college 
course areas. These subjects together with 
the number of cases for each course are: 
anthropology (603), chemistry (752), eco- 


nomics (1,125), English (2,005), foreign 
language (619), geology (475), history 
(583), mathematics (848), psychology 
(1,255), and zoology (469). 


B. Computations for Selecting the 
Predictors 


The illustrations of the computational 
procedures will follow the same number- 
ing as the steps in Part II, Section A. 

Table 1 gives the matrix of intercorrelations of the predictor variables
from which the subset for the differential prediction battery is to be
selected. For purposes of illustration the selection process is carried
out until all variables have been selected. This matrix is the one which
we have designated in Part II as ${}_1g$.

Note that for checking purposes the 
matrix has been summed by rows and 
columns and a grand total is entered in 
the lower right-hand corner. 

Table 2 is the transposed form of the matrix of validity coefficients
which we have called ${}_1T$.

1. The first row at the bottom of Table 2 consists of column summations.
The a row consists of sums of squares of column elements. These are the
values given by the term $\sum_j {}_1T_{kj}^2$ in Equation 1 of Part II.

Row b is obtained by squaring the corresponding element in the summation

TABLE 1
THE ${}_1g$ MATRIX OF PREDICTOR INTERCORRELATIONS

        1       2       3       4       5       6       7       8       Σ
1    1.0000   .5647   .6437   .7193   .5918   .4719   .1144   .3470  4.4528
2     .5647  1.0000   .5893   .5366   .5801   .3737   .3496   .2545  4.2485
3     .6437   .5893  1.0000   .5679   .5490   .4003   .1458   .2983  4.1943
4     .7193   .5366   .5679  1.0000   .5936   .4051   .1474   .3230  4.2929
5     .5918   .5801   .5490   .5936  1.0000   .4372   .2000   .3185  4.2702
6     .4719   .3737   .4003   .4051   .4372  1.0000   .1405   .1168  3.3455
7     .1144   .3496   .1458   .1474   .2000   .1405  1.0000   .4160  2.5137
8     .3470   .2545   .2983   .3230   .3185   .1168   .4160  1.0000  3.0741

Σ    4.4528  4.2485  4.1943  4.2929  4.2702  3.3455  2.5137  3.0741 30.3920




TABLE 2
THE ${}_1T'$ MATRIX OF VALIDITY COEFFICIENTS


Predictors 


Total Check 


OS 


.6856 
3086 
.0158 


row. The c row is obtained by dividing each element in the b row by N or
10, the number of criterion variables. These are the values given by the
term $(\sum_j {}_1T_{kj})^2/N$ in Equation 1 of Part II. The d row is
obtained by subtracting each element in the c row from its corresponding
element in the a row. These elements are the ${}_1\Delta_k$ values given
by Equations 1 and 2 of Part II.

The largest value in the d row is .0479 
for variable 2. This variable, therefore, be- 


9.8242 
95.8631 
9.5862 
.2380 


9. 5863 
. 2380 


comes the first selected variable, namely, 
variable a. 
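
The a, b, c, and d rows of step 1 can be sketched in a few lines of Python. The function name `first_step_rows` is hypothetical, and the ten validities are those of predictor 2 as they can be read from Table 2; this is an illustration under those assumptions, not a reproduction of the original worksheet.

```python
def first_step_rows(validities):
    """Return the summation row and rows a, b, c, d for one predictor column."""
    n_criteria = len(validities)            # N, the number of criteria
    total = sum(validities)                 # summation row
    a = sum(v * v for v in validities)      # row a: sum of squares
    b = total ** 2                          # row b: square of the column sum
    c = b / n_criteria                      # row c: row b divided by N
    d = a - c                               # row d: the first-step index value
    return total, a, b, c, d


# Validities of predictor 2 (high school mathematics) with the ten criteria,
# as read from Table 2 (an assumption about the garbled source table).
pred2 = [.3357, .4716, .4200, .3674, .3852, .3349, .2335, .4257, .3029, .4490]
total, a, b, c, d = first_step_rows(pred2)
print(round(total, 4), round(a, 4), round(b, 4), round(c, 4), round(d, 4))
```

Under these values the d entry agrees with the .0479 reported in the text for variable 2.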

2. The next step is to compute the ${}_2g$ matrix shown in Table 3. This
is computed by Equations 3 or 4 of Part II. A convenient procedure is to
have each g table on a separate worksheet. To get ${}_2g$ from ${}_1g$,
the ath column may be copied on a separate strip and laid alongside the
${}_1g$ column for which the corresponding ${}_2g$ column is to be
computed. We may illustrate the use of Equations 3 and 4 of


TABLE 3
THE ${}_2g$ MATRIX


6 


OM Sw 


. 2609 
. 1801 
. 2046 
2204 
. 8604 
.0099 
.0217 


Check 


° 


2.0537 


2.0132 


1.7578 


z 2.0536 


2.0132 


1.7580 


12 
1 2 3 4 5 6 7 8 
4378 +3357 +4793 -395° +2053 
-4716 +3737 3876 .4906 .2855 .2958 . 2891 
. 4028 .4200 3645 -449° -4329 . 2106 .2727 -3721 
48065 3674 -4045 -4410 .4184 2848 -2055 -4427 
3852 -4400 -3793 3855 .2794 . 1288 2419 
3383 -3349 -3216 -3701 3836 .1972 . 1963 -3185 
-3440 -2335 «3208 3851 . 2870 .1§72 .0952 . 2899 
-4257 -3790 .2928 -4156 . 2673 . 2998 2265 
3029 .3228 «3969S «3799 1992 -1978 3387 
1 -4071 -449° -3531 -4119 -4656 +2375 +2033 .4210 
Zz 3-8315 3.7259 3-6094 3.9930 4.0601 2.3300 2.1809 3.3168 
a 1.4931 saat I 1.6181 1.6670 . 5609 -5176 1.1470 
b 14.6804 13.8823 13 15.9440 16.4844 5.4289 4.7563 11.0012 
1.4680 1.3882 I 1.5944 1.6484 .5429 -4750 1.1001 
d .0186 .0180 .0469 
1 a 3 4 5 ' 7 8 
.4163 . 2641 — .0830 2033 
° ° ° ° 
-6527 .2517 . 2072 — .0602 1483 
.2517 .7121 . 2823 — .0402 . 1864 
.2072 . 2823 — .0028 . 1709 
. 1801 . 2046 .2204 .0099 .0217 
—.0602 —.0402 —.0028 .8778 «3270 
-1483 . 1864 . 1709 .3270 -9352 
° 1.6907 461.8056 1.0285 1.9928 




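Part II for getting the first column of ${}_2g$ as follows:

       ${}_1g_{i1}$  -  ${}_1g_{i2}$  x  ${}_1g_{21}$  =  ${}_2g_{i1}$
1      1.0000   -    .5647   x   .5647   =    .6811
2       .5647   -   1.0000   x   .5647   =    0
3       .6437   -    .5893   x   .5647   =    .3109
4       .7193   -    .5366   x   .5647   =    .4163
5       .5918   -    .5801   x   .5647   =    .2641
6       .4719   -    .3737   x   .5647   =    .2609
7       .1144   -    .3496   x   .5647   =   -.0830
8       .3470   -    .2545   x   .5647   =    .2033

Σ      4.4528   -   4.2485   x   .5647   =   2.0537   Check
                                             2.0536   Σ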


Notice that the equations also are applied to the summation elements of
${}_1g$. The resulting values are entered in the appropriate column
below the ${}_2g$ matrix as the "check" items. The summation element for
each column of ${}_2g$ is simply the total of all entries above the
"check" element, and should agree with the latter within the limits of
rounding errors. The computations of successive columns of ${}_2g$ are
made in the same way as the first except that the appropriate column of
${}_1g$ is used instead of the first, and the constant multiplier for
the elements of the ath column of ${}_1g$ is the appropriate element
from the ath row of ${}_1g$ instead of the first.
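
The column-by-column reduction of step 2 can be sketched as follows. The helper name `partial_out` is hypothetical; the two columns are the first and second columns of the predictor intercorrelation matrix as read from Table 1, so the sketch only illustrates the arithmetic of the worked example above under that reading.

```python
def partial_out(col_j, col_a, g_aj, g_aa=1.0):
    """Residual column j after removing the selected predictor a.

    Each entry is g_ij - g_ia * g_aj / g_aa; at the first step g_aa is unity.
    """
    return [g_ij - g_ia * g_aj / g_aa for g_ij, g_ia in zip(col_j, col_a)]


# Columns 1 and 2 of the intercorrelation matrix (values as read from Table 1).
col1 = [1.0000, .5647, .6437, .7193, .5918, .4719, .1144, .3470]
col2 = [.5647, 1.0000, .5893, .5366, .5801, .3737, .3496, .2545]

# Predictor 2 is the first selected variable; its correlation with predictor 1
# is the constant multiplier for the first column.
new_col1 = partial_out(col1, col2, g_aj=.5647)
column_sum = sum(new_col1)
print([round(v, 4) for v in new_col1], round(column_sum, 4))
```

The column total should agree, within rounding, with the "check" value obtained by applying the same formula to the summation elements.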
3. Next the ${}_2T$ matrix is computed by means of Equation 5 or 6 of
Part II from the ${}_1T$ matrix, its ath or second row and the ath or
second row (or column) of the ${}_1g$ matrix. Table 4 is the ${}_2T$
matrix in transposed form. As in the case of the g matrices, it is well
to have each T matrix on a separate worksheet. To get ${}_2T'$ from
${}_1T'$ the ath or second column of ${}_1T'$ may be copied on a
separate strip and laid alongside of the ${}_1T'$ column for which the
corresponding ${}_2T'$ column is to be computed. We illustrate the use
of Equations 5 and 6 of Part II for getting the first column of
${}_2T'$ as follows:

${}_1T'_{k1} - {}_1T'_{k2} \times {}_1g_{21} = {}_2T'_{k1}$, with the constant multiplier ${}_1g_{21} = .5647$ for each of the ten criteria k.



TABLE 4
THE ${}_2T'$ MATRIX


OS S WwW DH 


° 


° 


II 
63 
3° 
33 
1.72762 
I 2 3 4 5 6 7 8 Total Check 
+2210 2992 2003 .0799 1083 .2910 
.0958 -1345 . 2230 1093 . 1309 1691 
.1170 . 1893 .0590 .1259 
. 1880 2439 -2053 -1475 .0771 -3492 
2130 -1726 .1621 -1355 —.0059 -1439 
.1242 1904 . 1893 .0720 .0792 2333 
. 1832 2598 .1516 .0700 .0136 2305 
.1287 .0044 1687 . 1082 .1510 . 1182 
+1443 +2344 2042 .0860 .O919 2616 
I = .0885 -1710 2051 .0697 1063 3067 
Check 1.7275 1.5037 1.9937 1.8987 9376 .8783 2.3686 
z= 1.7276 1.5037 1.9938 1.8989 -9377 .8783 3687 
a «3362 . 2466 -4390 .3651 .0960 .0997 .6103 2.1929 
b 2.9846 2.2611 3.9752 3.6058 .8793 -7714 5.6107 20.0881 
2985 -3975 3606 .0879 .0771 -5611 2.0088 2.0088 
d -0377 .0205 -O415 -0045 .0226 .0492 . 1841 1841 
e -0554 +0314 .0583 .0068 .0094 .0257 .0526 




The checking procedure is precisely the same as for the ${}_2g$ matrix.
The remaining columns of the ${}_2T'$ matrix are obtained in the same
way by using the appropriate columns of the ${}_1T'$ matrix and the
appropriate element from the ath or second row of the ${}_1g$ matrix.

4. The rows at the bottom of the ${}_2T'$ matrix are not quite the same
as for the ${}_1T'$ matrix. The first row at the bottom is labeled
"check." This row, as previously indicated, is obtained by applying
Equation 5 of Part II to the $\Sigma$ row of ${}_1T'$. The summation row
consists simply of column sums and should, within rounding errors, be
the same as the "check" row immediately above it. Rows a, b, c, and d
are obtained in precisely the same way as the corresponding rows for
${}_1T'$.

Row d is the value ${}_2S_k^2$ given by Equation 7 or 8 of Part II.
However, an additional row e is required in this and subsequent T'
matrices.

5. Row e is obtained by dividing each element in row d by the
corresponding diagonal element of the ${}_2g$ matrix. These are the
${}_2\Delta_k$ values given by Equation 9 or 10 of Part II. For example,
for the first element in row e we have

$${}_2\Delta_1 = \frac{.0377}{.6811} = .0554.$$

It is of interest to note that, in the case of the ${}_1T'$ table, we
would also have an e row which would be obtained by dividing each
element in its d row by the corresponding diagonal element of the
${}_1g$ matrix. However, the diagonal elements of the ${}_1g$ matrix are
all unity; therefore, the e row for the ${}_1T'$ matrix would be
identical to its d row.

The highest value in the e row of Table 4 is .0583 for variable 4.
Therefore, the second selected variable or differential predictor b is
variable 4. Using Equation 31 of Part II, we have as our index of
differential prediction efficiency for the first two selected variables

$$\phi_2 = .0479 + .0583 = .1062.$$

TABLE 5
MATRIX OF ${}_iG_k'$ VECTORS WITH RECIPROCALS OF THE DIAGONAL ELEMENTS FROM THE
CORRESPONDING ${}_ig_k$ VECTORS

$1/{}_ig_{kk}$      1       2       3       4       5       6       7       8    Check     Σ
1.00000000    .5647  1.0000   .5893   .5366   .5801   .3737   .3496   .2545  4.2485  4.2485
1.40429714    .5846    0      .3535  1.0000   .3964   .2873  -.0564   .2618  2.8271  2.8272
1.12815884    .1064    0      .0928    0      .1094  -.0360   .3808  1.0000  1.6538  1.6534
1.33868808   -.1277    0     -.1035    0     -.0319   .0448  1.0000    0      .7814   .7817
1.82448458    .2646    0     1.0000    0      .1750   .2085    0       0     1.6482  1.6481
1.91058463    .1154    0       0       0     1.0000   .2367    0       0     1.3521  1.3521
2.70197243   1.0000    0       0       0       0      .2824    0       0     1.2824  1.2824
1.39586082     0       0       0       0       0     1.0000    0       0     1.0000  1.0000

6. Each element of the bth or 4th row of ${}_2g$ in Table 3 is now
divided by the 4th diagonal element, as described below, to get the
${}_2G_b'$ values given by Equation 11 or 12 of Part II and the results
are entered in the second row of Table 5. The first row is simply the
ath or second row of the ${}_1g$ matrix of Table 1. Since the reciprocal
of the 4th diagonal element of the ${}_2g$ matrix is required in the
computation of the F matrix, it is computed first and entered to the
left of the row. Then each element of the 4th row of ${}_2g$ is
multiplied by it to get the other entries in the row. The check sum and
the bth or 4th element are both included. The 4th entry in the second
row of Table 5 should be unity. This checks the computation of the
reciprocal. The




TABLE 6
THE ${}_3g$ MATRIX


4 


cooooooo 


+0595 


° 


° 


total of the entries, exclusive of the reciprocal and the "check," is
entered at the extreme right of the row. This should be the same as the
"check" entry immediately to its left.

7. The ${}_3g$ matrix is now computed from the ${}_2g$ matrix of step 2,
its bth or 4th column, and the ${}_2G_b'$ vector of step 6. The
procedure is precisely the same as for step 2 except that the constant
multiplier for a given column is taken from the appropriate element of
the ${}_2G_b'$ vector in the second row of Table 5. The ${}_3g$ matrix
is given in Table 6.


1419 


8. Next the ${}_3T'$ matrix is computed from the ${}_2T'$ matrix of
step 3, its bth or 4th column, and the ${}_2G_b'$ vector of step 6. The
procedure is the same as for step 3 except that the constant multiplier
for a given column is taken from the corresponding element of the
${}_2G_b'$ vector in the second row of Table 5. Table 7 gives the
${}_3T'$ matrix.

9. The rows a through d of the ${}_3T'$ matrix in Table 7 are computed
just as in step 4.

10. Rowe of 37” is obtained in the same 
way as for ,7” except that now each ele- 


TABLE 7
THE ${}_3T'$ MATRIX


Check 


7 Total 


Ooo OM 


.1252 
1385 
-0909 
.0899 
.0282 
.1540 


° 
° 


° 


3701 
3.4110 
.0350 
-0395 


15 
I | 3 | 5 6 7 8 
I -4377 -1413 +0943 
2 ° ° ° ° 
3 1637 . 1078 
4 ° ° ° 
5 »§510 +1393 -0970 
6 -1413 -1393 — .0319 
7 —.0§05 .O131 .O214 +3375 
8 — .0319 8864 
Check —_.8768 -979° ° 1.0076 1.1794 1.4659 
.8766 .9789 ° 1.0075 1.1795 1.4656 
I 2 3 5 6 
.0817. —.0061 .2127 
. 1697 .0707 - 1339 
.1007. — .0046 . 2067 
. 1086 -0774 2854 
—--0859 .0987 
“1138 .O173 1835 
—.0046 .1625 
1432 .0897 1014 
-1113 .0187 1051 2002 
I -1373 .0206 -1159 . 2619 
1084 3648 .9907 1.8466 
1. 1086 . 3050 .9906 1. 8469 
a .0428 ° .0779 ° .0276 1196 
b 3150 ° 6384 ° 1.2290 9813 6.7085 
.0316 ° .0638 ° . 1229 .0133 .0981 .6708 .6709 
d -O112 ° ° .0215 1064 
e .0256 ° .0250 .0187 .0178 .0245 




TABLE 8 


| 
| 


OMS 
| 


.7207 


ment of row d is divided by the corre- 
sponding diagonal element of the 3g ma- 
trix of Table 6, The largest value in row e 
of Table 7 is .0395 for variable 8. Therefore, the third or cth
differential predictor is variable 8. By means of Equation 31 of Part II
we have as our index of differential prediction efficiency for the first
three selected variables

$$\phi_3 = .1062 + .0395 = .1457.$$


11. The repetition of steps 6 through 
10 for the selection of the fourth predictor 
is illustrated by the numerical example as 
follows: 


THE ${}_4g$ MATRIX


° 
.0984 


. 5410 

.1428 

— .0238 
° 


.8472 


.8472 1.2323 


a. The ${}_3G_c'$ vector in the third row of Table 5 is obtained by
Equation 21 or 26 of Part II from the cth or 8th row of the ${}_3g$
matrix of Table 6 and its 8th diagonal element.

b. The ${}_4g$ matrix of Table 8 is obtained by means of Equation 22 or
27 of Part II, from the ${}_3g$ matrix of Table 6, its cth or 8th
column, and the third row of Table 5.

c. The ${}_4T'$ matrix of Table 9 is obtained by means of Equation 23 or
28 of Part II, from the ${}_3T'$ matrix of Table 7, its cth or 8th
column, and the third row of Table 5.

d. The ${}_4S_k^2$ values in row d of Table 9


TABLE 9
THE ${}_4T'$ MATRIX


| 


| 
00 


ie) 


Total Check 


00 


16 
I 2 3 4 5 6 7 8 
-1549 .1447 — .0954 ° 
° ° ° ° 
1108 — .0773 ° 
° ° ° ° 
.0984 — .0238 ° 
.1108 8005 +0335 ° 
— .0773 -0335 - 747° ° 
° ° ° ° 
Check . 7208 ° . 8430 ° 1.2322 ° 
1 5 6 7 
.0955 ° .0584 .OO15 .0442 
.0188 ° .0781 .0028 .0598 
.0753 ° .O774 .0877. —.0178 
° .0829 — .0338 
.0399 ° .0937 .0200 
.0703 ° .0308 0012) — .0337 
.0905 ° .1321 .0933 .1160 
.0428 ° .0894 .0259 .0289 
I .0038 ° 1087 .0300 .0162 
Check ° ° .2875 ° 
° ° .go66 -4313 . 2873 ° 
a .0228 ° .0554 ° .0934 .0323 .0307 ° . 2346 
.1334 ° 3939 ° . 1860 .0825 ° 1.6177 
° .0304 ° .0822 .0186 .0082 ° 1617 
d .0095 ° .0160 ° .O112 .0225 ° .0729 .0729 
.0222 ° .0288 ° .0207 .0301 ° 




TABLE 10
THE ${}_5g$ MATRIX


4 


| 
| 
| 
| 
| 
| 


OU NH 


° 
| | 


| 


©0000000 
| 


° 
° 


TABLE 11
THE ${}_5T'$ MATRIX


Total Check 


4 


x 


I 
2 
3 
4 
5 
6 
7 
8 
9 
° 


| 


° 


TABLE 12
THE ${}_6g$ MATRIX


eoooo0000 


° 


° 


17 
1 3 5 7 8 
.0858 . 1490 
° ° 
-0959 -1143 
° ° 
5402 -1439 
-1439 7990 
° ° 
° ° 
Check 
z= -7953 
I 2 3 5 6 
.0563 
.0009 
.0205 
. 1037 
0633 
.0210 
.0387 
0399 
.0299 
I .0277 
Check .4022 ° -6573 .9156 .4184 ° 
. 4019 .g156 .4184 ° 
a .0235 ° .0584 .0958 .0310 ° . 2087 
b ° .1750 ° ° 1.6070 
.0162 ° -0432 .0838 .O175 ° 1607 1607 
d .0073 ° .O152 .0120 .0135 ° ° .0480 .0480 
.O175 ° .0277 .0222 .0169 ° ° 
I 
I 
2 ° 
3 ° 
4 
5 
6 .1188 
7 ° 
8 ° 
Check 5563 
z +5563 


TABLE 13
THE ${}_6T'$ MATRIX

I 2 3 4 6 7 8 Total Check 
1 .0298 ° ° ° —.0214 ° ° 
2 — .O110 ° ° 1501 .0622 ° 
3 ° -0756 —.0051 ° ° 
4 .0843 ° ° ° .0639 -0732 ° ° 
5 ° ° ° -0574 ° 
6 ° ° ° -O142 ° ° 
7 ° ° ° —.0125 ° ° 
° ° ° .1168 ° ° 
9 0178 ° ° ° .0823 ° 
10 0263 ° ° ° 1082 .0282 ° ° 
Check . 2283 ° ° ° ° ° 
° ° ° 8005 2814 ° ° 

a .0106 ° ° ° .0774 .O192 ° ° 1072 
b .0520 ° ° ° .6408 .0792 ° ° -7720 

° ° .0079 ° ° 0772 -0772 

d ° ° .0133 .O113 ° ° 0300 0300 
e -O143 ° ° -0254 .01406 ° ° 


are obtained by means of Equation 24 or 29 of Part II from the rows
above it.

e. The ${}_4\Delta_k$ values in the e row of Table 9 are obtained by
Equation 25 or 30 of Part II, from the d row of Table 9 and the diagonal
elements of Table 8.

For illustrative purposes the procedure has been carried out until all
the original eight predictor variables were exhausted. The remaining
rows of Table 5 give the successive ${}_iG_k'$ vectors in the order in
which the variables were selected. Tables 10 through 17 give alternately
the remaining g and T' matrices.

For convenient reference the successive 


TABLE 14
THE ${}_7g$ MATRIX

          1      2      3      4      5      6      7      8
1       .3701    0      0      0      0    .1045    0      0
2         0      0      0      0      0      0      0      0
3         0      0      0      0      0      0      0      0
4         0      0      0      0      0      0      0      0
5         0      0      0      0      0      0      0      0
6       .1045    0      0      0      0    .7459    0      0
7         0      0      0      0      0      0      0      0
8         0      0      0      0      0      0      0      0
Check   .4746    0      0      0      0    .8502    0      0
Σ       .4746    0      0      0      0    .8504    0      0



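indices of differential prediction efficiency are summarized below.

$\phi_1 = {}_1\Delta_2 = .0479$
$\phi_2 = \phi_1 + {}_2\Delta_4 = .0479 + .0583 = .1062$
$\phi_3 = \phi_2 + {}_3\Delta_8 = .1062 + .0395 = .1457$
$\phi_4 = \phi_3 + {}_4\Delta_7 = .1457 + .0301 = .1758$
$\phi_5 = \phi_4 + {}_5\Delta_3 = .1758 + .0277 = .2035$
$\phi_6 = \phi_5 + {}_6\Delta_5 = .2035 + .0254 = .2289$
$\phi_7 = \phi_6 + {}_7\Delta_1 = .2289 + .0176 = .2465$
$\phi_8 = \phi_7 + {}_8\Delta_6 = .2465 + .0114 = .2579$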
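
The whole selection procedure of Section B (compute the index for each remaining predictor, select the largest, then reduce the g and T matrices) can be sketched as a short loop. The function name `select_predictors` and the two-predictor, three-criterion data are hypothetical and serve only to show the mechanics; they are not the monograph's data.

```python
def select_predictors(g, T, n_select):
    """Greedy selection: g is the (n x n) predictor intercorrelation matrix,
    T the (n x N) validity matrix. Returns selection order and increments."""
    n, N = len(g), len(T[0])
    g = [row[:] for row in g]                # work on copies
    T = [row[:] for row in T]
    remaining = set(range(n))
    order, increments = [], []
    for _ in range(n_select):
        best_k, best_delta = None, -1.0
        for k in sorted(remaining):
            if g[k][k] <= 1e-12:             # nothing left to partial out
                continue
            s = sum(t * t for t in T[k]) - sum(T[k]) ** 2 / N   # row d
            delta = s / g[k][k]                                  # row e
            if delta > best_delta:
                best_k, best_delta = k, delta
        if best_k is None:
            break
        order.append(best_k)
        increments.append(best_delta)
        remaining.remove(best_k)
        a = best_k
        G = [g[a][j] / g[a][a] for j in range(n)]   # row a over its diagonal
        col = [g[i][a] for i in range(n)]
        T_a = T[a][:]
        for i in range(n):                   # reduce the g matrix
            for j in range(n):
                g[i][j] -= col[i] * G[j]
        for i in range(n):                   # reduce the T matrix
            T[i] = [T[i][c] - G[i] * T_a[c] for c in range(N)]
    return order, increments


# Hypothetical illustration: two predictors, three criteria.
g_demo = [[1.0, 0.5], [0.5, 1.0]]
T_demo = [[0.3, 0.5, 0.2], [0.4, 0.1, 0.6]]
order, increments = select_predictors(g_demo, T_demo, 2)
print(order, [round(x, 4) for x in increments], round(sum(increments), 4))
```

The running total of the increments plays the role of the cumulative indices summarized above; with every predictor selected, it equals the index computed directly from the full set.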


C. Computations for the Regression 
Weights 

In solving for the regression vectors we shall assume that only the
first four selected predictors are to be used. The successive steps to
be followed are numbered to correspond to the steps in Part II,
Section B.

1. We first prepare the L matrix of Equation 35, Part II, from the
appropriate entries in Table 5. The first row of Table 18 is obtained
from the first row of Table 5. The first entry in the first row of the L
matrix is zero, corresponding to the 1.000 entry in the first row of
Table 5. The remaining elements correspond to all but the first selected
predictor. They are copied with opposite sign in the order of selection
from the first row of Table 5.




TABLE 15
THE ${}_7T'$ MATRIX


Total Check 


| 
} 


| + 


9 
° 
° 
° 


° 
° 


| 


oooo0000000 


| 


eoooo°o 


This row is indicated in Equation 32 of Part II. The first two entries
in the second row are zero. The remaining elements correspond to all but
the first two selected predictors, copied with opposite sign in the
order of selection from the second row of Table 5. This row is indicated
by Equation 33 of Part II. The remaining rows are similarly obtained
from the corresponding rows of Table 5.

2. Next we prepare the F' matrix of Equation 37 of Part II as shown in
Table 19. The diagonal elements of this matrix are copied from the
left-hand column of Table 5. These are the elements indicated by
Equations 36 of Part II. To get the first element in the second row of
Table 19 we use Equation 38 of Part II. This gives

$$-.7535 = -.5366 \times 1.4043.$$

The second and first elements in the third row of Table 19 are,
respectively, by Equations 39 of Part II

$$-.2954 = (-.2618)1.1282,$$
$$-.1286 = (-.5366)(-.2954) - .2545(1.1282).$$

Similarly by Equations 40 of Part II, the third, second, and first
elements of the fourth row of Table 19 are

$$-.5098 = (-.3808)1.3387,$$
$$.2090 = (-.2618)(-.5098) + .0564(1.3387),$$
$$-.4504 = (-.5366).2090 - .2545(-.5098) - (.3496)1.3387.$$

The use of matrices in solving for the elements of F' is indicated
briefly in step 2, Part II, Section B.
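
The row-by-row arithmetic of step 2 follows a simple recurrence: each diagonal element is a reciprocal from Table 5, and each element to its left is a sum of products of L entries with elements already found in the same row. A minimal sketch, assuming the L entries and reciprocals quoted in the computations above (the function name `build_f_prime` is hypothetical):

```python
def build_f_prime(L, recips):
    """Lower-triangular F' built from the strictly upper-triangular L matrix
    and the reciprocals of the diagonal elements (left column of Table 5)."""
    m = len(recips)
    F = [[0.0] * m for _ in range(m)]
    for i in range(m):
        F[i][i] = recips[i]
        for j in range(i - 1, -1, -1):       # work leftward along row i
            F[i][j] = sum(L[j][k] * F[i][k] for k in range(j + 1, i + 1))
    return F


# L matrix for predictors 2, 4, 8, 7 (Table 5 entries with opposite sign).
L = [[0, -.5366, -.2545, -.3496],
     [0, 0, -.2618, .0564],
     [0, 0, 0, -.3808],
     [0, 0, 0, 0]]
recips = [1.0000, 1.4043, 1.1282, 1.3387]
F_prime = build_f_prime(L, recips)

# Check of step 3: products with the first row of Table 5 (selected order).
first_row = [1.0000, .5366, .2545, .3496]
checks = [sum(f * g for f, g in zip(row, first_row)) for row in F_prime]
for row in F_prime:
    print([round(v, 4) for v in row])
print([round(c, 4) for c in checks])
```

Because the sketch carries unrounded intermediate values, its entries may differ from the tabled four-place figures in the last decimal.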


TABLE 16
THE ${}_8g$ MATRIX


I 2 3 


|] 


5 


= 


~ 


On 


a 


19 
I 6 8 
— .0314 
02607 
— .0230 
.O581 
.0484 
— .0064 
— .0165 
+0379 
— .0044 
I .0026 
-1357 ° .0920 ° 
a .0083 ° .0097 ° ° .o180 
b .0184 ° ° .0269 
.0018 ° .0008 ° ° .0026 .0027 
d .0065 ° .0085 ° ° .O1§50 .O154 
.0176 ° -O114 ° ° 
Zz ° ° ° ° o .7164 0 ° 


TABLE 17
THE ${}_8T'$ MATRIX


Total 


Check 


0 
0 
0 


° 
° 


° 


| 


i) 
° 


° 
° 


TABLE 18
THE L MATRIX

             (a)      (b)      (c)      (d)
              2        4        8        7
(a) 2         0     -.5366   -.2545   -.3496
(b) 4         0        0     -.2618    .0564
(c) 8         0        0        0     -.3808
(d) 7         0        0        0        0


3. To check the F' matrix we multiply the elements in a given row by the
corresponding elements from the first row of Table 5 and sum the
products. These sums should be zero for all but the first, which
obviously must be 1. The check for the final row of the F' matrix as
indicated by Equations 41 of Part II is

$$1.0000(-.4504) + .5366(.2090) + .2545(-.5098) + .3496(1.3387) = 0.$$


oooo°o 


The entries in the check column to the right of Table 19 are these
product summations and should all vanish except for rounding errors. The
extreme right-hand column of Table 19 consists of row summations
exclusive of the check column.

4. The rows of the t matrix, given by Equation 43 of Part II, are taken
from the appropriate columns of the ${}_iT'$ matrices. This matrix is
shown in Table 20. The first row is the ath or second column of the
${}_1T'$ matrix of Table 2. The second row is the bth or 4th column of
the ${}_2T'$ matrix of Table 4. The remaining rows are similarly
obtained from the appropriate columns of the successive ${}_iT'$
matrices.

5. By means of Equation 44 of Part II the matrix of regression vectors
$\beta$ in Table 21 is obtained from Tables 19 and 20. The


TABLE 19
THE F' MATRIX


20 
I — .0384 
2 .0347 
3 — .0245 
4 0304 
5 .0428 
6 — 
7 —.0214 
8 .0385 
9 — .0067 
10 — .0013 
Check ° ° -0535 
° ° ° ° .0537 
a ° ° .0085 ° .0085 
b ° ° ° ° .0029 ° .0029 
° ° ° ° .0003 ° .0003 
d ° ° ° ° .0082 ° .0082 .0082 
° ° ° ° .O114 ° 
(a) (b) (c) (d) 
2 4 8 7 
(a) 2 ° — .5366 
(b) 4 
(c) 8 ° ° 
(d) 7 ° ° 
             (a)      (b)      (c)      (d)     Check*     Σ
              2        4        8        7
(a) 2      1.0000      0        0        0      1.0000   1.0000
(b) 4      -.7535   1.4043      0        0       .0000    .6508
(c) 8      -.1286   -.2954   1.1282      0       .0000    .7042
(d) 7      -.4504    .2090   -.5098   1.3387     .0000    .5875
* See text.




TABLE 20
THE t MATRIX


1 2 3 4 


6 7 10 2 


- 3674 
.0875 


+3357 
2992 
.2127 


-4200 
.2236 2439 

.2854_.. 
.0598 —.0178 —.0338 


+3349 
+1904 
1835 
.0200 — .0337 


+2335 
. 2598 
1625 


.1710 
. 2619 
.0162 


3.7259 
1.9938 
1.8469 

. 2873 


first column of Table 21 is obtained from Table 19 and the first column
of Table 20 as follows:

The first element of the column is the sum of products of corresponding
elements in the first column of Table 19 and the first column of
Table 20. The computations for the first element are

$$.0630 = 1.0000(.3357) - .7535(.2992) - .1286(.2127) - .4504(.0442).$$


The second element in the column is the 
sum of products of corresponding ele- 


The elements in the second column of 
Table 21 are obtained in the same way as 
the first except now the second column of 
Table 20 is used instead of the first. Each 
remaining column of Table 21 is obtained 
by using Table 19 and the corresponding 
column of Table 20. 
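
The column-by-column products of step 5 amount to multiplying each column of Table 19 into a column of Table 20. A minimal sketch for the first criterion, assuming the F' entries of Table 19 and the t values quoted in the computation above (the function name `regression_weights` is hypothetical):

```python
def regression_weights(F_prime, t_col):
    """Beta weights for one criterion: the jth weight is the sum of products
    of the jth column of F' with the criterion's column of the t matrix."""
    m = len(t_col)
    return [sum(F_prime[i][j] * t_col[i] for i in range(m)) for j in range(m)]


# F' for predictors 2, 4, 8, 7 (Table 19) and the first column of Table 20.
F_prime = [[1.0000, 0, 0, 0],
           [-.7535, 1.4043, 0, 0],
           [-.1286, -.2954, 1.1282, 0],
           [-.4504, .2090, -.5098, 1.3387]]
t_col1 = [.3357, .2992, .2127, .0442]

beta1 = regression_weights(F_prime, t_col1)
print([round(b, 4) for b in beta1])
```

The four weights correspond to predictors 2, 4, 8, and 7 in that order.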


D. Computations for the Multiple
Correlations


1. The conventional method for solving 
for the squares of the multiple correla- 
tions is indicated by Equation 46 or 47 


TABLE 21
THE $\beta$ MATRIX


. 1980 
2654 
2027 
.o801 


-1549 
-3311 
— .0238 


-3119 .0875 
.0847 .2761 


-2792 
. 1662 
. 2872 


.0268 -0217 


7462 .7167 


5998 .6072 .6134 +7542 


-7167 


7462 


ments in the second column of Table 19 
and the first column of Table 20. Each 
succeeding element is the sum of products of corresponding elements in
the corresponding column of Table 19 and the first column of Table 20.

The element in the check position for the first column is the sum of
products of corresponding elements in the $\Sigma$ column at the right
of Table 19 and the first column of Table 20. The element in the
$\Sigma$ position is the sum of all elements in the column exclusive of
the check entry. The check and sum entries should be the same within
rounding error.


5998 .6072 


6134 +7543 


of Part II. For computational convenience it is well first to copy in
row form the columns from Table 2 corresponding to the selected
predictors. These should be copied in the order selected as shown in
Table 22 and are designated the ${}_1T_{(i)}$ matrix. The squared
multiple correlations are then calculated and entered in the first row
of Table 23. The first entry in this row is the sum of products of
corresponding elements in the first columns of Tables 21 and 22. These
computations are

$$.2921 = .3357(.0630) + .4793(.3666) + .3764(.2174) + .2257(.0592).$$


21 
5 
2 -4257  .3029 
4 
8 -1014 .2002 
7 -1160 ~=.0289 
I 2 3 4 5 6 7 8 9 10 
2 .0630 3136 -2577 .1588 .0320 
4 3606 .1676 2062 -2174 . 3098 
8 .2174 1065 .1286 .1968 2005 
7 .0592 — .0452 
Check .7062 7048 +5471 


TABLE 22
THE ${}_1T_{(i)}$ MATRIX


3 4 


8 


3674 
-4410 
-4427 
+2055 


+3357 
4793 
3794 
- 2257 


.4200 
-449° 
-3721 
.2727 


The second entry in the row is the sum 
of products of corresponding elements in 
the second columns of Tables 21 and 22. 
The remaining entries in the row are com- 
puted in the same way by using cor- 


-4257 
. 2928 
. 2265 
2998 


-4499 


-4119 3.9930 
-4210 3.3168 
.2633 2.1809 


element of Table 19. The first element in 
the second row of Table 23 is the sum of 
products of corresponding elements of the 
first columns of Tables 20 and 24. The re- 
maining elements are the sums of products 


TABLE 23 


Tue Matrix 


3 5 


7 


. 2028 
. 2027 


. 2996 . 3108 
. 2996 .3108 


. 2783 


. 2167 
2167 


. 1806 
. 1806 


.2152 


3204 


responding columns from Tables 21 and 
22. 

2. The second row of Table 23 may be 
regarded as a check on the first or vice 
versa since the two rows are obtained by 
independent methods. Its computation is 
indicated by Equation 48 in Part II. To 
get the second row of Table 23 we first 
compute Table 24 from Table 20 and the 
diagonal elements of Table 19. The first 
row of Table 24 is the same as the first 
row of Table 20. The second row of Table 
24 is obtained by multiplying each ele- 
ment in the second row of Table 20 by the 
second diagonal element of Table 19. Each 
remaining row is obtained by multiplying 
each element of the corresponding row of 
Table 20 by the corresponding diagonal 


of corresponding elements of correspond- 
ing columns of Tables 20 and 24. 
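
The two independent routes to the squared multiple correlation can be compared directly. The sketch below uses the figures quoted in the text for the first criterion; the variable names are hypothetical, and the second route simply weights each squared t entry by the corresponding diagonal element of Table 19, as described in step 2.

```python
# First criterion, predictors in order of selection (2, 4, 8, 7).
validities = [.3357, .4793, .3764, .2257]     # first column of Table 22
betas = [.0630, .3666, .2174, .0592]          # first column of Table 21
t_col = [.3357, .2992, .2127, .0442]          # first column of Table 20
diag = [1.0000, 1.4043, 1.1282, 1.3387]       # diagonal of Table 19

# Conventional formula: sum of validity times beta weight.
r2_conventional = sum(r * b for r, b in zip(validities, betas))

# Check formula: sum of squared t entries times the diagonal elements.
r2_check = sum(d * t * t for d, t in zip(diag, t_col))

print(round(r2_conventional, 4), round(r2_check, 4))
```

Both values should agree with each other, and with the .2921 computed above, within rounding error.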


IV. MATHEMATICAL DERIVATIONS 
A. The Index of Differential Prediction 
Efficiency 
The index of differential prediction 
efficiency employed in the methods out- 
lined in the previous sections was based 
on the assumption that we wish to obtain 
from the predictors the best estimates in 
the least-square sense for the difference 
scores derived from all possible pairs of 
criterion scores. We assume standard 
measures for both predictor and criterion 
scores. We let 
M =number of cases, 
n=number of predictors, 


TABLE 24 


Tue D;t Matrix 


3 4 


z 


.4200 
.3140 2424 
.2332 
.o801 —.0238 —.0452 


3674 


— .O451 


3.7259 
2.7999 
2.0837 


0.3846 


3-7259 
2.7 
2.083 


0. 3848 


22 
1 2 P| 5 6 7 i 9 10 z 
2 -3852 2335 30200 
4 3876 +3793 «3701 3851 - 3909 
& 2891 .2419 .3185 .2899 3387 
7 . 2958 .1288 .1963 .0952 .1978 
I 2 6 8 9 10 
1 2 5 6 7 8 9 10 Check 
2 3357-4716 -3349 -2335 -4257 -30290 .4490 
4 .4202 .1889 .2074 .3648 .0904 .3292 .2401 
8 .2400 -1144 .2259 
7 0592. 1171 .0387 


N = number of criteria,

X = the ($M \times n$) matrix of predictor measures,

Y = the ($M \times N$) matrix of criterion measures,

H = the ($M \times N^2$) matrix consisting of difference vectors for all
possible pairs of criterion vectors i and j, including $i = j$,

Z = the ($M \times N$) matrix of best least-square estimates of Y
obtained from X,

K = the ($M \times N^2$) matrix of best least-square estimates of H
obtained from X,

$\beta$ = the ($n \times N$) matrix of least-square regression vectors
for estimating Y from X,

B = the ($n \times N^2$) matrix of least-square regression vectors for
estimating H from X,

${}_1g$ = the ($n \times n$) matrix of intercorrelations of the
predictors,

${}_1T$ = the ($n \times N$) matrix of validity coefficients, i.e., the
correlations of the predictors with the criteria,

r = the ($N \times N$) matrix of intercorrelations of the criteria,

C = the ($N \times N$) matrix of covariances of the predicted criterion
measures.


We define square matrices of order N of the form

$$F_i = e_i 1' - I \qquad [1]$$


where $e_i$ is a column vector with all elements zero but the ith which
is unity, and $1'$ is a row vector with all unit elements. We also
define the supermatrix G' of order ($N \times N^2$) by

$$G' = (F_1, F_2, \cdots, F_N). \qquad [2]$$
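Because of the above definitions

$$\frac{X'X}{M} = {}_1g, \qquad [3]$$

$$\frac{X'Y}{M} = {}_1T, \qquad [4]$$

$$\frac{Y'Y}{M} = r, \qquad [5]$$

$$\frac{Z'Z}{M} = C, \qquad [6]$$

$$Z = X\beta, \qquad [7]$$

$$K = XB. \qquad [8]$$
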
We now consider the residual matrix

$$E = H - K. \qquad [9]$$

It is the trace of E'E that we wish to minimize. Substituting 8 in 9 we
have

$$E = H - XB. \qquad [10]$$

The solution for B in 10 that minimizes the trace of E'E is well known
to be

$$B = (X'X)^{-1}X'H. \qquad [11]$$

From 1 and 2 and the definition of H we have

$$H = YG'. \qquad [12]$$

Substituting 12 in 11,

$$B = (X'X)^{-1}X'YG'. \qquad [13]$$

Substituting 3 and 4 in 13,

$$B = {}_1g^{-1}\,{}_1T\,G'. \qquad [14]$$


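Next consider the residual matrix

$$\epsilon = Y - Z. \qquad [15]$$

Substituting 7 in 15,

$$\epsilon = Y - X\beta. \qquad [16]$$

Determining $\beta$ so as to minimize the trace of $\epsilon'\epsilon$
we have

$$\beta = (X'X)^{-1}X'Y. \qquad [17]$$

Substituting 3 and 4 in 17,

$$\beta = {}_1g^{-1}\,{}_1T. \qquad [18]$$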




Substituting 18 in 14 gives

$$B = \beta G'. \qquad [19]$$

Because of 1 and 2, 19 shows that with a given set of predictors the
least-square regression vector for estimating the difference between any
two criteria is given by the difference between the two least-square
regression vectors for estimating the criteria separately. Equation 19
generalizes the results given by Mollenkopf (7) and Thorndike (9).

We require now the trace of E'E. First we get from 10 and 11

$$E = [I - X(X'X)^{-1}X']H. \qquad [20]$$

From 20 it can be readily proved that

$$E'E = H'[I - X(X'X)^{-1}X']H. \qquad [21]$$

Substituting 12 in 21,

$$E'E = G[Y'Y - Y'X(X'X)^{-1}X'Y]G'. \qquad [22]$$

Substituting 3, 4, and 5 in 22,

$$E'E = MG[r - {}_1T'\,{}_1g^{-1}\,{}_1T]G'. \qquad [23]$$

Now from 7 and 18,

$$Z = X\,{}_1g^{-1}\,{}_1T. \qquad [24]$$

From 3 and 24,

$$\frac{Z'Z}{M} = {}_1T'\,{}_1g^{-1}\,{}_1T. \qquad [25]$$

From 6 and 25,

$$C = {}_1T'\,{}_1g^{-1}\,{}_1T. \qquad [26]$$

Substituting 26 in 23,

$$E'E = MG(r - C)G'. \qquad [27]$$
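
The result stated in connection with Equation 19, that the least-square weights for a difference of two criteria equal the difference of the separate weights, can be checked numerically. The sketch below is only an illustration: the data are hypothetical, two predictors are used so that the normal equations can be solved in closed form, and no intercept is fitted, in keeping with the assumption of standard measures.

```python
def least_squares_2(x1, x2, y):
    """Least-square weights of y on two predictors (no intercept),
    obtained by solving the 2 x 2 normal equations directly."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * v for a, v in zip(x1, y))
    s2y = sum(b * v for b, v in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return ((s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det)


# Hypothetical scores on two predictors and two criteria.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y_i = [1.5, 1.0, 3.5, 3.0, 5.5]
y_j = [0.5, 2.5, 2.0, 4.5, 4.0]

beta_i = least_squares_2(x1, x2, y_i)
beta_j = least_squares_2(x1, x2, y_j)
beta_diff = least_squares_2(x1, x2, [a - b for a, b in zip(y_i, y_j)])
gap = max(abs(beta_diff[k] - (beta_i[k] - beta_j[k])) for k in range(2))
print(beta_diff, gap)
```

The agreement follows from the linearity of the least-square solution in the criterion vector.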


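Let

$$\psi = \frac{\mathrm{tr}\,E'E}{M}. \qquad [28]$$

From 27 and 28,

$$\psi = \mathrm{tr}\,[G(r - C)G']. \qquad [29]$$

From 29 it can be shown that

$$\psi = \mathrm{tr}\,[(r - C)G'G]. \qquad [30]$$

But from 1 and 2 it can be shown that

$$G'G = 2N\left[I - \frac{11'}{N}\right]. \qquad [31]$$

From 30 and 31,

$$\psi = 2N\,\mathrm{tr}\left[(r - C)\left(I - \frac{11'}{N}\right)\right] \qquad [32]$$

or

$$\frac{\psi}{2N} = \left(\mathrm{tr}\,r - \mathrm{tr}\,\frac{11'r}{N}\right) - \left(\mathrm{tr}\,C - \mathrm{tr}\,\frac{11'C}{N}\right); \qquad [33]$$

but

$$\mathrm{tr}\,\frac{11'r}{N} = \frac{1'r1}{N} \qquad [34]$$

and

$$\mathrm{tr}\,\frac{11'C}{N} = \frac{1'C1}{N}. \qquad [35]$$

Furthermore, if we let $D_r$ and $D_c$ be the diagonal matrices of the
diagonals of r and C respectively, we can because of 34 and 35 rewrite
33 as

$$\frac{\psi}{2N} = \left(1'D_r1 - \frac{1'r1}{N}\right) - \left(1'D_c1 - \frac{1'C1}{N}\right). \qquad [36]$$

From 36 we may write

$$\frac{\psi}{2N(N-1)} = \left[\frac{1'D_r1}{N} - \frac{1'(r - D_r)1}{N(N-1)}\right] - \left[\frac{1'D_c1}{N} - \frac{1'(C - D_c)1}{N(N-1)}\right]. \qquad [37]$$

Suppose now we let

$\bar{r}_{jk}$ be the average intercorrelation of the criteria,

$\bar{C}_{kk}$ be the average variance of the estimated criterion
measures,

$\bar{C}_{jk}$ be the average covariance of the estimated criterion
measures.

Then

$$\bar{r}_{jk} = \frac{1'(r - D_r)1}{N(N-1)}, \qquad [38]$$

$$\bar{C}_{kk} = \frac{1'D_c1}{N}, \qquad [39]$$

$$\bar{C}_{jk} = \frac{1'(C - D_c)1}{N(N-1)}, \qquad [40]$$

$$\frac{1'D_r1}{N} = 1. \qquad [41]$$

Substituting 38 through 41 in 37,

$$\frac{\psi}{2N(N-1)} = (1 - \bar{r}_{jk}) - (\bar{C}_{kk} - \bar{C}_{jk}). \qquad [42]$$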
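
The identity for G'G used in this derivation can be verified numerically for a small number of criteria. The sketch below assumes the sign convention $F_i = e_i1' - I$ (the opposite sign gives the same product) and uses hypothetical helper names; it is an illustration, not part of the original derivation.

```python
def f_matrix(i, n):
    """F_i = e_i 1' - I: row i is all ones, then the identity is subtracted."""
    return [[(1.0 if r == i else 0.0) - (1.0 if r == c else 0.0)
             for c in range(n)] for r in range(n)]


def gram_of_g(n):
    """G'G, i.e. the sum over i of F_i F_i', for n criteria."""
    total = [[0.0] * n for _ in range(n)]
    for i in range(n):
        F = f_matrix(i, n)
        for r in range(n):
            for c in range(n):
                total[r][c] += sum(F[r][k] * F[c][k] for k in range(n))
    return total


N = 4
GtG = gram_of_g(N)
# Right-hand side: 2N [I - 11'/N].
target = [[2 * N * ((1.0 if r == c else 0.0) - 1.0 / N) for c in range(N)]
          for r in range(N)]
max_gap = max(abs(GtG[r][c] - target[r][c]) for r in range(N) for c in range(N))
print(GtG[0], max_gap)
```

For N = 4 the diagonal entries are 6 and the off-diagonal entries are -2, as the identity requires.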

We notice now that for any given matrix of criterion measures if the
trace of E'E is a minimum, or, what amounts to the same thing, if $\psi$
is a minimum, then the right-hand parentheses on the right sides of 36
and 42 must be a maximum, since, for any given set of criteria, the
intercorrelations among them are fixed. We may then take either of these
right-hand terms as an index of the differential prediction efficiency
of a given set of predictors and a given set of criterion variables. But
the term in the right parentheses in 42 is simply the difference between
the average predicted criterion variance and covariance, which when
multiplied by (N - 1) we proposed as the index of differential
prediction efficiency. Since the left side of 42 cannot be negative, it
is also clear that the average predicted variance less the average
predicted covariance cannot be greater than unity less the average
criterion intercorrelation.
Designating the index of differential prediction efficiency by $\phi$ we
have then

$$\phi = 1'D_c1 - \frac{1'C1}{N} \qquad [43]$$


or alternately 
B. The Predictor Selection Formulas 


It is clear, then, that for any given set of predictor and criterion
variables the matrix of regression vectors $\beta$ which will yield the
highest index of differential prediction efficiency is the one given by
18, which will also yield the most accurate estimates in the
least-square sense of the separate criterion measures. However, the
problem is to select from a larger set of predictors that subset which
will yield a $\phi$ as given in 44 as large or larger than the $\phi$
obtained from any other subset of equal size. It seems probable that for
a given sample of data the only way that the subset of specified size,
yielding the maximum $\phi$, could be found is actually to calculate the
$\phi$'s for all possible sets of the size specified. (It is beyond the
scope of this development to consider the problems involving sampling
fluctuations of the $\phi$'s.) However, suppose we select first the
single predictor with the largest $\phi$. To this we add the predictor
which yields the greatest increment to the $\phi$. The third predictor
selected is the one which yields the largest increment to the $\phi$ of
the first two. This procedure is continued until the subset of desired
size has been selected. We shall assume that this procedure of selecting
one predictor at a time will yield a subset whose $\phi$ is sufficiently
close for practical purposes to that of the subset of equal size with
the largest $\phi$.

We begin therefore by deriving recursion formulas based on Equations 26 and
43, but involving a smaller number of predictor variables than the total
number available. Let us assume that we have already selected $i$ predictors
and calculated their $\phi_i$ and that we wish to find which of the remaining
$n-i$ predictors will yield the largest increment ${}_i\phi_k$ to $\phi_i$.
We let


$\rho_{(i)}$ = the $(i \times i)$ matrix of intercorrelations of the $i$
selected predictors arranged in the order of selection,

$T_{(i)}$ = the $(i \times N)$ matrix of correlations of the $i$ predictors
with the $N$ criteria arranged in the order of selection,

$\rho_{(i+k)}$ = the matrix of intercorrelations of the $i$ selected
predictors and an unselected predictor $k$,

$T_{(i+k)}$ = the matrix of correlations of the $i$ selected predictors and
predictor $k$ with the $N$ criteria,

$\rho_{(ik)}$ = the column vector of correlations of the first $i$ selected
predictors with predictor $k$,

$T_k$ = the column vector of correlations of predictor $k$ with the $N$
criteria,

$C_{(i)}$ = the covariance matrix of predicted criterion scores derived from
the $i$ selected predictors,

$C_{(i+k)}$ = the covariance matrix of predicted criterion scores derived
from the $i$ selected predictors and predictor $k$.


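According to the above definitions,

$$\rho_{(i+k)} = \begin{pmatrix} \rho_{(i)} & \rho_{(ik)} \\ \rho_{(ik)}' & 1 \end{pmatrix}, \tag{45}$$

$$T_{(i+k)} = \begin{pmatrix} T_{(i)} \\ T_k' \end{pmatrix}. \tag{46}$$

Using Equation 26 we have

$$C_{(i)} = T_{(i)}'\,\rho_{(i)}^{-1}\,T_{(i)}, \tag{47}$$

and

$$C_{(i+k)} = T_{(i+k)}'\,\rho_{(i+k)}^{-1}\,T_{(i+k)}. \tag{48}$$

If now we define²

$$\beta_{(ik)} = \rho_{(i)}^{-1}\rho_{(ik)}, \tag{49}$$

it can be proved that

$$\rho_{(i+k)}^{-1} = \begin{pmatrix} \rho_{(i)}^{-1} & 0 \\ 0 & 0 \end{pmatrix} + \frac{\begin{pmatrix} -\beta_{(ik)} \\ 1 \end{pmatrix}\begin{pmatrix} -\beta_{(ik)}' & 1 \end{pmatrix}}{1-\beta_{(ik)}'\rho_{(ik)}}. \tag{50}$$

Substituting 46 and 50 in 48 we have

$$C_{(i+k)} = T_{(i)}'\,\rho_{(i)}^{-1}\,T_{(i)} + \frac{(T_k - T_{(i)}'\beta_{(ik)})(T_k' - \beta_{(ik)}'T_{(i)})}{1-\beta_{(ik)}'\rho_{(ik)}}. \tag{51}$$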

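Suppose now we let

$${}_iT_k = T_k - T_{(i)}'\beta_{(ik)}, \tag{52}$$

$${}_i\rho_{kk} = 1 - \beta_{(ik)}'\rho_{(ik)}. \tag{53}$$

Substituting 47, 52, and 53 in 51,

$$C_{(i+k)} = C_{(i)} + \frac{{}_iT_k\,{}_iT_k'}{{}_i\rho_{kk}}. \tag{54}$$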


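From 43 we write

$$\phi_i = 1'D_{C_{(i)}}1 - \frac{1'C_{(i)}1}{N} \tag{55}$$

and

$$\phi_{(i+k)} = 1'D_{C_{(i+k)}}1 - \frac{1'C_{(i+k)}1}{N}. \tag{56}$$

From 54, 55, and 56,

$$\phi_{(i+k)} = \phi_i + \frac{{}_iT_k'\,{}_iT_k - \dfrac{(1'\,{}_iT_k)^2}{N}}{{}_i\rho_{kk}}. \tag{57}$$

Suppose now we let

$${}_iS_k^2 = {}_iT_k'\,{}_iT_k - \frac{(1'\,{}_iT_k)^2}{N}, \tag{58}$$

$${}_i\phi_k = \frac{{}_iS_k^2}{{}_i\rho_{kk}}. \tag{59}$$

² Not to be confused with the $\beta$ defined in Equation 18 or 79.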


Equation 58 corresponds to 24 and 29 of Part II while 59 corresponds to 25
and 30 of Part II.

From 57, 58, and 59

$$\phi_{(i+k)} = \phi_i + {}_i\phi_k. \tag{60}$$

Equation 60 corresponds to Equation 31 of Part II.

Equation 60 indicates that the crucial requirement for the selection
procedure is that recursion formulas be provided for the computation of the
${}_i\phi_k$ increments.
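A minimal Python sketch of one such increment follows, for the simplest case in which a single predictor $a$ has already been selected. It assumes the forms ${}_iT_k = T_k - T_{(i)}'\beta_{(ik)}$ (52), ${}_i\rho_{kk} = 1 - \beta_{(ik)}'\rho_{(ik)}$ (53), ${}_iS_k^2 = {}_iT_k'\,{}_iT_k - (1'\,{}_iT_k)^2/N$ (58), and ${}_i\phi_k = {}_iS_k^2/{}_i\rho_{kk}$ (59). With one selected predictor, $\beta_{(ik)}$ reduces to the single correlation between $a$ and $k$. The function name and the numerical values are hypothetical.

```python
def phi_increment(t_a, t_k, rho_ak):
    """Increment to phi from adding predictor k when only predictor a is selected.

    t_a, t_k : validities of predictors a and k with the N criteria
    rho_ak   : correlation between predictors a and k
    """
    N = len(t_k)
    # Assumed Eq. 52 with one selected predictor: beta reduces to rho_ak.
    resid = [tk - rho_ak * ta for ta, tk in zip(t_a, t_k)]
    resid_var = 1.0 - rho_ak ** 2                          # assumed Eq. 53
    s2 = sum(v * v for v in resid) - sum(resid) ** 2 / N   # assumed Eq. 58
    return s2 / resid_var                                  # assumed Eq. 59

# Hypothetical validities for two predictors against three criteria.
t_a = [0.1, 0.6, 0.2]
t_k = [0.6, 0.2, 0.3]
inc = phi_increment(t_a, t_k, 0.2)
print(round(inc, 6))
```

The numerator rewards residual validities that differ from criterion to criterion, while the denominator penalizes a candidate that overlaps heavily with the predictor already chosen.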

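First, let us consider the equation analogous to 52, assuming that $i$
predictors plus predictor $k$ have been selected and that we wish to
investigate the unselected predictor $p$. This would give us

$${}_{(i+k)}T_p = T_p - T_{(i+k)}'\,\beta_{((i+k)p)}. \tag{61}$$

Corresponding to Equation 49 we have

$$\beta_{((i+k)p)} = \rho_{(i+k)}^{-1}\,\rho_{((i+k)p)}. \tag{62}$$

Now evidently

$$\rho_{((i+k)p)} = \begin{pmatrix} \rho_{(ip)} \\ \rho_{kp} \end{pmatrix}, \tag{63}$$

where $\rho_{kp}$ is the correlation between predictors $k$ and $p$.
Substituting 50 and 63 in 62,

$$\beta_{((i+k)p)} = \begin{pmatrix} \rho_{(i)}^{-1}\rho_{(ip)} \\ 0 \end{pmatrix} + \begin{pmatrix} -\beta_{(ik)} \\ 1 \end{pmatrix}\frac{\rho_{kp} - \beta_{(ik)}'\rho_{(ip)}}{1-\beta_{(ik)}'\rho_{(ik)}}. \tag{64}$$

From 49 and 64

$$\beta_{((i+k)p)} = \begin{pmatrix} \beta_{(ip)} \\ 0 \end{pmatrix} + \begin{pmatrix} -\beta_{(ik)} \\ 1 \end{pmatrix}\frac{\rho_{kp} - \beta_{(ik)}'\rho_{(ip)}}{1-\beta_{(ik)}'\rho_{(ik)}}. \tag{65}$$

We let

$${}_i\rho_{kp} = \rho_{kp} - \beta_{(ik)}'\rho_{(ip)} \tag{66}$$

and

$${}_iG_{kp} = \frac{{}_i\rho_{kp}}{{}_i\rho_{kk}}. \tag{67}$$

Substituting 46, 65, and 67 in 61,

$${}_{(i+k)}T_p = T_p - T_{(i)}'\beta_{(ip)} + (T_{(i)}'\beta_{(ik)} - T_k)\,{}_iG_{kp}. \tag{68}$$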

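From 68

$${}_{(i+k)}T_p = (T_p - T_{(i)}'\beta_{(ip)}) - (T_k - T_{(i)}'\beta_{(ik)})\,{}_iG_{kp}. \tag{69}$$

Now since 52 is general for any value of $k$, we can write

$${}_iT_p = T_p - T_{(i)}'\beta_{(ip)}. \tag{70}$$

Substituting 52 and 70 in 69,

$${}_{(i+k)}T_p = {}_iT_p - {}_iT_k\,{}_iG_{kp}. \tag{71}$$

If now we assume that predictor $k$ was the $(i+1)$th variable selected, let
$k=w$, and write 71 in matrix form to include all values of $p$ from 1 to $n$
we have

$${}_{(i+1)}T = {}_iT - {}_iG_w\,{}_iT_{w.}' \tag{72}$$

where ${}_iT_{w.}'$ is the $w$th row from ${}_iT$ and ${}_iG_w$ is a column
vector. Equation 72 corresponds to 23 and 28 of Part II.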

Our next step is to get recursion formulas for the ${}_i\rho_{kp}$ values
given by 66. Suppose we have selected $i$ predictors, the last one being
predictor $w$. We wish then to find all ${}_{(i+1)}\rho_{kp}$ values. From 66
we write

[73]

Using 53, 63, 65, and 66 with appropriate subscripts in 73,

[74]

[75]

Using 66 in 75

$${}_{(i+1)}\rho_{kp} = {}_i\rho_{kp} - \frac{{}_i\rho_{kw}\,{}_i\rho_{wp}}{{}_i\rho_{ww}}. \tag{76}$$


We let

${}_{(i+1)}\rho$ = the $(n \times n)$ matrix for all values of $k$ and $p$ on
the left of 76,

${}_i\rho$ = the $(n \times n)$ matrix for all values of $k$ and $p$ for the
first term on the right of 76,

${}_i\rho_{.w}$ = the $w$th column vector of ${}_i\rho$.

Analogous to 67 we define

$${}_iG_w = \frac{{}_i\rho_{.w}}{{}_i\rho_{ww}}. \tag{77}$$

Using 77 and the definitions just given and assuming predictor $w$ as the
$i$th predictor selected, we rewrite 76 as

$${}_{(i+1)}\rho = {}_i\rho - {}_i\rho_{.w}\,{}_iG_w'. \tag{78}$$

Equation 77 corresponds to Equations 21 and 26 of Part II, while Equation 78
corresponds to Equations 22 and 27 of Part II.

The selection process begins with the ${}_1T$ matrix of validity
coefficients. By means of 58 for the case of $i=1$ we compute ${}_1S_k^2$ for
all values of $k$. Since all diagonal values of ${}_1\rho$ are unity, 59 is
the same as 58.

The predictor with the highest ${}_1S_k^2$ value is the first one selected.

By means of 78 a ${}_2\rho$ matrix is computed where $w$ is the first or
$a$th variable selected. Since all diagonal elements of ${}_1\rho$ are unity,
${}_1G_w$ given by 77 is the same as ${}_1\rho_{.w}$. It is easily
demonstrated that the $w$th row and column of ${}_{(i+1)}\rho$ must be zero
and that it must have all rows and columns vanish which correspond to the
selected variables.

A ${}_2T$ matrix is computed by means of Equation 72. Again, we note that
${}_1G_w = {}_1\rho_{.w}$ for $i=1$. Since the $w$th element of ${}_iG_w$
must be unity, it is easy to see that the $w$th row of ${}_{(i+1)}T$ in 72 is
zero. It also follows that this matrix must have $i$ vanishing columns.

By means of Equations 58 and 59, a second predictor is chosen. The routine
continues until the desired number of predictors is selected.
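The routine just described can be sketched in Python as follows. The sketch assumes that Equations 58, 59, 72, 77, and 78 have the forms ${}_iS_k^2 = {}_iT_k'\,{}_iT_k - (1'\,{}_iT_k)^2/N$, ${}_i\phi_k = {}_iS_k^2/{}_i\rho_{kk}$, ${}_{(i+1)}T = {}_iT - {}_iG_w\,{}_iT_{w.}'$, ${}_iG_w = {}_i\rho_{.w}/{}_i\rho_{ww}$, and ${}_{(i+1)}\rho = {}_i\rho - {}_i\rho_{.w}\,{}_iG_w'$. The function name `select_predictors`, the tolerance, and the data are hypothetical; the validity matrix is stored with one row per predictor.

```python
def select_predictors(rho, T, size, tol=1e-12):
    """Stepwise selection; returns (order, phi, residual T, residual rho)."""
    n, N = len(T), len(T[0])
    r = [row[:] for row in rho]   # residual predictor intercorrelations
    t = [row[:] for row in T]     # residual validities, one row per predictor
    order, phi = [], 0.0
    for _ in range(size):
        best, best_inc = None, float("-inf")
        for k in range(n):
            if k in order or r[k][k] <= tol:
                continue          # already selected, or no residual variance left
            s2 = sum(v * v for v in t[k]) - sum(t[k]) ** 2 / N  # assumed Eq. 58
            inc = s2 / r[k][k]                                  # assumed Eq. 59
            if inc > best_inc:
                best, best_inc = k, inc
        if best is None:
            break
        w = best
        g = [r[k][w] / r[w][w] for k in range(n)]               # assumed Eq. 77
        tw = t[w][:]                                            # w-th row of T
        rw = [r[k][w] for k in range(n)]                        # w-th column of rho
        t = [[t[k][j] - g[k] * tw[j] for j in range(N)] for k in range(n)]  # Eq. 72
        r = [[r[k][p] - rw[k] * g[p] for p in range(n)] for k in range(n)]  # Eq. 78
        order.append(w)
        phi += best_inc                                         # Eq. 60
    return order, phi, t, r

# Hypothetical data: three predictors, three criteria.
rho = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.3], [0.2, 0.3, 1.0]]
T = [[0.6, 0.2, 0.3], [0.5, 0.3, 0.4], [0.1, 0.6, 0.2]]
order, phi, t_res, r_res = select_predictors(rho, T, 2)
print(order, round(phi, 6))
```

After each step the row of the residual validity matrix belonging to the selected predictor is zero, as are the corresponding row and column of the residual intercorrelation matrix, which agrees with the remarks in the text.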


C. The Regression Matrix Formulas 


We shall now develop the rationale for solving for the matrix $\beta$ of
least-square regression vectors outlined in Part II, Section B. Let us
suppose that the total number of selected predictors is $i$. Then we may
write the matrix of regression vectors $\beta_{(i)}$ by analogy with Equation
18 as

$$\beta_{(i)} = \rho_{(i)}^{-1}\,T_{(i)}. \tag{79}$$

As we shall see, however, it is not necessary to compute the inverse of
$\rho_{(i)}$ and then premultiply this inverse into $T_{(i)}$. We may utilize
computations which are available from the predictor selection operations
which will greatly reduce the labor of computing the regression vectors.
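From 72

$${}_iG_w\,{}_iT_{w.}' = {}_iT - {}_{(i+1)}T. \tag{80}$$

We consider only the $i$ selected predictors and let $i$ and $w$ go from 1 to
$i$, and $a$ to $w$, respectively, where the latter sequence indicates the
order of selection of the predictors. We let ${}_i\gamma_w$ be a vector of
the elements from ${}_iG_w$ corresponding to the selected predictors and in
the order selected. Then from 80

$$\begin{aligned}
{}_1\gamma_a\,{}_1T_{a.}' &= {}_1T_{(i)} - {}_2T_{(i)} \\
{}_2\gamma_b\,{}_2T_{b.}' &= {}_2T_{(i)} - {}_3T_{(i)} \\
&\;\;\vdots \\
{}_i\gamma_w\,{}_iT_{w.}' &= {}_iT_{(i)} - {}_{(i+1)}T_{(i)}.
\end{aligned} \tag{81}$$

We let

$$t_{(i)}' = ({}_1T_{a.},\ {}_2T_{b.},\ \cdots,\ {}_iT_{w.}), \tag{82}$$

$$\gamma_{(i)}' = ({}_1\gamma_a,\ {}_2\gamma_b,\ \cdots,\ {}_i\gamma_w). \tag{83}$$

It can readily be shown that

$${}_{(i+1)}T_{(i)} = 0. \tag{84}$$

Summing both sides of 81 and using 82, 83, and 84 we have

$$\gamma_{(i)}'\,t_{(i)} = {}_1T_{(i)}. \tag{85}$$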


Suppose now we let

$$D_\rho = \begin{pmatrix} {}_1\rho_{aa} & & & \\ & {}_2\rho_{bb} & & \\ & & \ddots & \\ & & & {}_i\rho_{ww} \end{pmatrix}. \tag{86}$$

It can be shown because of the method by which $\gamma$ was computed that

$$\rho_{(i)} = \gamma_{(i)}'\,D_\rho\,\gamma_{(i)}. \tag{87}$$

Substituting 87 in 79,

$$\beta_{(i)} = (\gamma_{(i)}'\,D_\rho\,\gamma_{(i)})^{-1}\,T_{(i)}. \tag{88}$$

From 85,

$$t_{(i)} = (\gamma_{(i)}')^{-1}\,T_{(i)}. \tag{89}$$

Using 88 in 89,

$$(D_\rho\,\gamma_{(i)})\,\beta_{(i)} = t_{(i)}. \tag{90}$$

Now the matrix on the left of 90, whose inverse is required, is a triangular
matrix with all vanishing elements below the diagonal.

Suppose we let

$$F = (D_\rho\,\gamma_{(i)})^{-1}. \tag{91}$$

From 91,

$$\gamma_{(i)}\,F = D_\rho^{-1}. \tag{92}$$

Now because of the definition of $\gamma_{(i)}$ its diagonal elements are all
unity. We let $L$ be a matrix of the supradiagonal elements of
$\gamma_{(i)}$ with signs reversed. This is the definition of $L$ given by
Equations 32 through 34 of Part II. Then

$$\gamma_{(i)} = I - L. \tag{93}$$

Substituting 93 in 92,

$$F = LF + D_\rho^{-1}. \tag{94}$$

The method for computing the elements of $F$ outlined in Equations 36 through
40 of Part II is based on Equation 94. Since $L$ has zeros in and below the
diagonal, the diagonals of $F$ are given by $D_\rho^{-1}$ as defined in 86.
This solution corresponds to Equation 36 of Part II. It is possible, then, to
solve for each remaining element of $F$ in terms of elements already solved
for and the appropriate elements of $L$.
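The element-by-element solution can be sketched in Python as follows. The sketch assumes that Equation 94 reads $F = LF + D_\rho^{-1}$ with $L = I - \gamma_{(i)}$, where $\gamma_{(i)}$ has unit diagonal and zeros below it, and that the regression weights then follow from $\beta_{(i)} = F\,t_{(i)}$ (Equation 96). The function names and the numbers are hypothetical: they describe two selected predictors whose intercorrelation is .20, in their order of selection, and three criteria.

```python
def solve_F(gamma, d):
    """Solve F = L F + D^{-1} row by row, starting from the last row.

    gamma : unit upper-triangular matrix (only entries above the diagonal are used)
    d     : residual variances of the selected predictors (diagonal of D)
    """
    m = len(d)
    F = [[0.0] * m for _ in range(m)]
    for a in reversed(range(m)):
        for c in range(a, m):
            acc = 1.0 / d[a] if c == a else 0.0
            # L holds the supradiagonal elements of gamma with signs reversed.
            acc -= sum(gamma[a][b] * F[b][c] for b in range(a + 1, m))
            F[a][c] = acc
    return F

def regression_weights(F, t):
    """beta = F t, one row per selected predictor, one column per criterion."""
    return [[sum(F[a][c] * t[c][j] for c in range(len(t)))
             for j in range(len(t[0]))] for a in range(len(F))]

# Hypothetical by-products of a two-step selection.
gamma = [[1.0, 0.2], [0.0, 1.0]]
d = [1.0, 0.96]
t = [[0.1, 0.6, 0.2], [0.58, 0.08, 0.26]]   # residual validity rows at selection
F = solve_F(gamma, d)
beta = regression_weights(F, t)

# Original correlations for the same two predictors, used only as a check.
rho_s = [[1.0, 0.2], [0.2, 1.0]]
T_s = [[0.1, 0.6, 0.2], [0.6, 0.2, 0.3]]
```

Because each row of $F$ depends only on rows below it, no general matrix inversion is needed; the check data allow one to confirm that the weights satisfy $\rho_{(i)}\beta_{(i)} = T_{(i)}$.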

The check for $F$ given by 41 and 42 of Part II is readily proved. The first
row of $\rho_{(i)}$ is also the first row of $D_\rho\,\gamma_{(i)}$ or
${}_1\gamma_a'$. If we let

$$e_1' = (1, 0, \cdots, 0)$$

then

$${}_1\gamma_a'\,F = e_1'. \tag{95}$$

Once $F$ has been solved for we can solve for $\beta_{(i)}$ by means of the
equation

$$F\,t_{(i)} = \beta_{(i)}, \tag{96}$$

which corresponds to Equation 45 of Part II.


D. The Multiple Correlation Formula 


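Finally, we prove Equation 48 of Part II for the multiple correlation
squared. If we specify by $a, b, \cdots, w$, respectively, the predictor
variables in the order of their selection, we can write, because of 54,

$$C_{(i)} = \frac{{}_1T_{a.}\,{}_1T_{a.}'}{{}_1\rho_{aa}} + \frac{{}_2T_{b.}\,{}_2T_{b.}'}{{}_2\rho_{bb}} + \cdots + \frac{{}_iT_{w.}\,{}_iT_{w.}'}{{}_i\rho_{ww}}. \tag{97}$$

Because of 82 and 86 we can rewrite 97 as

$$C_{(i)} = t_{(i)}'\,D_\rho^{-1}\,t_{(i)}. \tag{98}$$

But the diagonal elements of $C_{(i)}$ are the variances of the predicted
criteria, which are precisely the multiple correlations squared. If we let
row $k$ of $t_{(i)}'$ be

$$t_{k.}' = ({}_1T_{ak},\ {}_2T_{bk},\ \cdots,\ {}_iT_{wk}) \tag{99}$$

and the $k$th diagonal of $C_{(i)}$ be $C_{kk}$, we have

$$C_{kk} = t_{k.}'\,D_\rho^{-1}\,t_{k.} \tag{100}$$

which corresponds to Equation 48 of Part II.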
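A short Python sketch of this last result follows. It assumes that Equation 100 amounts to summing, over the selected predictors, the squared residual validity of each with criterion $k$ divided by that predictor's residual variance at the time of selection. The function name and the numbers are hypothetical and are not those of the numerical illustration in Part III.

```python
def multiple_r_squared(t, d):
    """Squared multiple correlation for each criterion.

    t : residual validity rows of the selected predictors, in order of selection
    d : residual variances of those predictors at the time of selection
    """
    N = len(t[0])
    return [sum(t[a][k] ** 2 / d[a] for a in range(len(d))) for k in range(N)]

# Hypothetical by-products of a two-step selection against three criteria.
t = [[0.1, 0.6, 0.2], [0.58, 0.08, 0.26]]
d = [1.0, 0.96]
r2 = multiple_r_squared(t, d)
print([round(v, 6) for v in r2])
```

Each added predictor contributes a non-negative term, so the squared multiple correlations can only rise or stay level as the battery grows.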


V. SUMMARY 


This paper presented (a) a brief discussion of the differential prediction
problem, and (b) a detailed description of the computational methods
developed for selecting the differential predictors and solving for the
regression vectors and multiple correlations. It also includes (c) a
numerical illustration of the procedure, and (d) the mathematical derivation
of the formulas developed.

In the discussion of the differential prediction problem it was pointed out
that
we should have a single classification bat- 
tery of tests that, by means of differential 
weighting procedures, would enable us to 
predict success in each of a wide variety 
of activities. It was found necessary to 
define a suitable index of differential pre- 
diction efficiency for the battery. This in- 
dex was defined as a simple function of the 
variances of the predicted difference scores 
of all possible pairs of criterion variables. 
The larger this variance, the greater the 
differential prediction efficiency of the 
battery. It was shown mathematically 
that an equivalent definition can be ob- 
tained by maximizing the difference be- 
tween the average variance and the aver- 
age covariance of the predicted criterion 
scores. 

The labor involved in an exact solution 
to the problem of selecting that particular 
subset of predictors which would maxi- 
mize the index of differential prediction 
would be prohibitive. However, an alter- 
native technique using an iterative pro- 
cedure was developed for selecting, pro- 
gressively, the predictor which when com- 
bined with the previously selected set, will 
yield the highest index of differential pre- 
diction. The process is terminated when 
some arbitrarily designated number of 
predictors has been so selected. 

The numerical example presented in the 
computational section consists of opera- 
tions on a matrix of intercorrelations of 
eight predictor variables and a matrix of 
ten criterion variables. 

In addition to the development of the rationale of the selection procedure,
further equations were developed for obtaining the matrix of regression
vectors and the multiple-correlation coefficients without obtaining the
inverse of the predictor intercorrelation matrix.


REFERENCES


1. Brogden, H. E. An approach to the problem of differential prediction.
Psychometrika, 1946, 11, 139-154.

2. Brogden, H. E. Increased efficiency of selection resulting from
replacement of a single predictor with several differential predictors.
Educ. psychol. Measmt, 1951, 11, 173-195.

3. Dwyer, P. S. Linear computations. New York: Wiley, 1951.

4. Gulliksen, H. Theory of mental tests. New York: Wiley, 1950.

5. Horst, P., & Smith, S. The discrimination of two racial samples.
Psychometrika, 1950, 15, 271-289.

6. Lord, F. M. Notes on a problem of multiple classification.
Psychometrika, 1952, 17, 297-304.

7. Mollenkopf, W. G. Predicted differences and differences between
predictions. Psychometrika, 1950, 15, 409-417.

8. Stead, W. H., & Shartle, C. L. Occupational counseling techniques. New
York: American Book Co., 1940.

9. Thorndike, R. L. The problem of classification of personnel.
Psychometrika, 1950, 15, 215-235.

10. Votaw, D. F., Jr. Methods of solving some personnel-classification
problems. Psychometrika, 1952, 17, 255-266.

11. Wesman, A. G., & Bennett, G. K. Problems of differential prediction.
Educ. psychol. Measmt, 1951, 11, 265-272.

(Accepted for publication March 16, 1954) 


